Ordinary least squares using Python

To demonstrate an OLS application using Python, I have followed Example 2.3 from Wooldridge’s book (4e). I downloaded the related data set (CEOSAL1) from here.

The salary equation

salary = \beta_0 + \beta_1 roe + u
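
For reference, here is a sketch of the standard textbook OLS estimators for this one-regressor model. The index i, the sample size n, and the bar notation for sample means are my own notation and do not come from the example itself:

```latex
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (roe_i - \overline{roe})(salary_i - \overline{salary})}
                     {\sum_{i=1}^{n} (roe_i - \overline{roe})^2},
\qquad
\hat{\beta}_0 = \overline{salary} - \hat{\beta}_1 \, \overline{roe}
```

The slope is the sample covariance of roe and salary divided by the sample variance of roe, and the intercept makes the fitted line pass through the point of sample means.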

You can follow the code below to perform the analysis:
import os
# set the working directory:
os.chdir('C:\\Users\\Admin\\...')
# import the required packages:
import pandas as pd
import statsmodels.formula.api as smf
# read the Excel file into a DataFrame using pandas
# (reading .xls files requires the xlrd package):
df = pd.read_excel("ceosal1.xls")
# view the first 10 rows of the data:
print(df.head(10))
# list the variable names:
print(list(df.columns))
# print the salary series (first column):
print(df.iloc[:, 0])
# print the roe series (fourth column):
print(df.iloc[:, 3])
# create another DataFrame (df2) holding only the two variables needed for OLS:
df2 = pd.DataFrame({"salary": df.iloc[:, 0], "roe": df.iloc[:, 3]})
# run the regression and store the results in "result":
result = smf.ols(formula="salary ~ roe", data=df2).fit()
# print the estimated parameters:
print(result.params)
# print the full regression summary:
print(result.summary())
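
To see what a one-regressor OLS fit computes under the hood, here is a minimal pure-Python sketch of the closed-form intercept and slope. The function name ols_simple and the toy numbers are made up for illustration; they are not the CEOSAL1 data and are not part of statsmodels.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# made-up toy data, only to exercise the function (NOT the CEOSAL1 data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.9, 11.2, 13.8, 17.0]

b0, b1 = ols_simple(x, y)
print("intercept:", round(b0, 4))
print("slope:", round(b1, 4))
```

Passing the salary and roe columns as plain lists (for example df2["roe"].tolist()) should give the same two numbers as result.params, up to floating-point rounding.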

You will end up with the same results as Wooldridge:

[Figure: regression output]
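
Once you have the two coefficients, you can use the fitted line for prediction. The sketch below hard-codes the estimates as I recall them from the textbook example (intercept 963.191 and slope 18.501, with salary in thousands of dollars and roe in percent); treat these constants as an assumption and replace them with the values from your own result.params. The function name predict_salary is hypothetical.

```python
# Assumed coefficients, quoted from the textbook example;
# replace them with result.params from your own run.
BETA0 = 963.191  # intercept: predicted salary (thousands of dollars) when roe = 0
BETA1 = 18.501   # slope: change in predicted salary per one-point increase in roe


def predict_salary(roe):
    """Predicted salary (thousands of dollars) for a given roe (percent)."""
    return BETA0 + BETA1 * roe


# predicted salary for a firm with a return on equity of 30 percent
pred_30 = predict_salary(30)
print(round(pred_30, 3))
```

With a fitted statsmodels result, result.predict(pd.DataFrame({"roe": [30]})) would be the equivalent call.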
