Pandas excel/transpose/barplot/to_csv

Yeju Ham
2 min readMar 4, 2021

--

This is very basic things that everyone begins the Pandas should know.

  1. open the excel data
import pandas as pd
url = 'https://ds-lecture-data.s3.ap-northeast-2.amazonaws.com/stocks/Travel.xlsx'
df1 = pd.read_excel(url, sheet_name = '008770 ')
df2 = pd.read_excel(url, sheet_name = '035250')

df1 = df1.T
new_header = df1.iloc[0]
df1 = df1[1:]
df1.columns = new_header

df2 = df2.T
header_name = df2.iloc[0]
df2 = df2[1:]
df2.columns = header_name
  • open the excel file : df.read_excel(“ url ”, sheet_name= “ ”)
  • transpose the row and column : df.T / df.transpose()
  • when there is one more line as like (0 1 2 …) and you want to remove it : iloc[0]

2. check the data

df1.shape
df2.shape
df1.isnull().sum()
df2.isnull().sum()
df1 = df1.fillna(0)
df2 = df2.fillna(0)

!sudo apt-get install -y fonts-nanum
!sudo fc-cache -fv
!rm ~/.cache/matplotlib -rf

import matplotlib.pyplot as plt

plt.rc('font', family='NanumBarunGothic')

df1['FCF'].plot.bar()
df2['FCF'].plot.bar()
  • check the dimension of the data : df.shape
  • find where is N/A and replace it to 0

→ df.insull() : True/False, when it is N/A it returns True. No N/A = False

→ df.insull().sum() : summarize all True by each groups in the row.

  • replace N/A to certain number : df.fillna(‘number’)

then N/A is replaced to 0. So there is no N/A or blank in this data no more. It is represented as 0

3. df[‘ — ‘].plot.bar()

df1['FCF'].plot.bar()

extract a column FCF and visualize it with the bar plot.

4. python export csv

df1_t.to_csv('df1.csv',index =False)
df2_t.to_csv('df2.csv',index= False)

df.to_csv(‘file_name’, index = False)

--

--

Yeju Ham
Yeju Ham

Written by Yeju Ham

learner, writer, traveler, data science beginner with the whole passion

No responses yet