LEFT JOIN in SQL

Remember the tables in this example.

table_left
KEY  VALUE
1    L1
4    L2
6    L3
8    L4

table_right
KEY  VALUE
4    R1
6    R2
9    R3
11   R4

You would like to perform a LEFT JOIN in SQL. This means “take all the values from ‘table_left’, and only the matching values from ‘table_right’.” The result…
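The join described in this excerpt can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module and an in-memory database. The table names and values come from the example, while the lower-case column names and the aliases `l` and `r` are only illustrative choices.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Recreate the two example tables with the values from the post.
conn.executescript("""
    CREATE TABLE table_left  (key INTEGER, value TEXT);
    CREATE TABLE table_right (key INTEGER, value TEXT);
    INSERT INTO table_left  VALUES (1, 'L1'), (4, 'L2'), (6, 'L3'), (8, 'L4');
    INSERT INTO table_right VALUES (4, 'R1'), (6, 'R2'), (9, 'R3'), (11, 'R4');
""")

# LEFT JOIN: keep every row of table_left and attach the matching
# row of table_right when the keys are equal, otherwise NULL.
rows = conn.execute("""
    SELECT l.key, l.value, r.value
    FROM table_left AS l
    LEFT JOIN table_right AS r ON l.key = r.key
    ORDER BY l.key
""").fetchall()

for row in rows:
    print(row)
# (1, 'L1', None)
# (4, 'L2', 'R1')
# (6, 'L3', 'R2')
# (8, 'L4', None)

conn.close()
```

Keys 1 and 8 exist only in table_left, so their right-hand value comes back as NULL (None in Python). Keys 9 and 11 exist only in table_right and do not appear at all.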

Economic variable types by their measurement scales

Estimating econometric models requires a good understanding of the data being used. This is especially true of the measurement scales of the variables in the dataset. Let me walk you through the different measurement scales used in statistics: Nominal Scale, Ordinal Scale, Interval Scale and Ratio Scale. Let’s think of a…

Linear regression using some of the popular software

I use the data (in the form of ‘Dataset1.xls’ or ‘Dataset1.txt’; both contain the same data) from the online course “Econometrics: Methods and Applications” by Erasmus University Rotterdam. The data includes two variables, “Price” and “Sales”; the former is the independent variable and the latter is the dependent variable. Let’s get to it! First some…
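As a rough illustration of what such a regression computes, here is a minimal sketch of simple ordinary least squares in plain Python. The numbers are made up for the example and are not the values in Dataset1, and the helper name `ols_fit` is hypothetical. Statistical packages report the same intercept and slope along with many more diagnostics.

```python
# Hypothetical observations, NOT the course data in Dataset1.
price = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable
sales = [9.0, 7.5, 6.5, 4.0, 3.0]   # dependent variable


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_fit(price, sales)
print(f"Sales = {intercept:.2f} + ({slope:.2f}) * Price")
```

With these invented numbers the slope is negative, meaning that higher prices go together with lower sales in the sample.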

Data Science Tools

1. Data Management: persist – retrieve (SQL: MySQL, PostgreSQL; NoSQL: MongoDB, Apache CouchDB, Apache Cassandra; file based: Hadoop HDFS; cloud based: Ceph, Elasticsearch; commercial: Oracle Database, Microsoft SQL Server, IBM Db2; Amazon DynamoDB, Cloudant, IBM Db2)
2. Data Integration and Transformation: ETL – ELT (Apache Airflow, Kubeflow, Apache Kafka, Apache NiFi, …

You are an econometrician/economist, and you want to be a data scientist: have a look at my journey.

An applied econometrician, using mathematical models, statistical methods and data within the framework of economic theory, creates econometric models that are used for policy decisions or for forecasting/projection purposes. See the diagram below. A data scientist, with programming skills, mathematical/statistical knowledge and domain/substantive expertise, tries to answer questions using data. See the diagram below. So, it seems econometrics…

Programming languages used by data scientists

Data scientists manage, extract, transform, analyze and visualize data using software. In this regard, SQL is essential. Data is stored in a database, and every step of the data management process somehow leans on SQL, whether the databases are clustered, on the cloud or conventional. Python and R are among the widely used languages…

Big data analytic databases

When big data is concerned, conventional database systems are simply not enough, because the data not only gets larger but also gets more complicated as unstructured data comes into play. There is plenty of software that enables data scientists to deal with big data. Much of it was created by the Apache Software Foundation, depending on the…

Some fundamental principles on RDBs

(As the name suggests) Relational databases are based on a foundation called a ‘relation’, which is implemented as a ‘table’. The structure of a table (called its schema or metadata) gives information on every column’s data type. Although they are not required for a table, ‘primary key’ and ‘foreign key’ are two important properties columns may have. The…
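To make the roles of a primary key and a foreign key concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables, their columns and the helper `try_insert` are hypothetical names chosen for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other database systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# The schema declares each column's data type plus the key constraints.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,   -- uniquely identifies a row
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customers(customer_id),  -- foreign key
        amount      REAL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (100, 1, 25.0);
""")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same primary key twice: rejected.
duplicate_pk = try_insert("INSERT INTO customers VALUES (1, 'Bob')")
# Order pointing at a customer that does not exist: rejected.
missing_fk = try_insert("INSERT INTO orders VALUES (101, 99, 10.0)")
# Order pointing at an existing customer: accepted.
valid = try_insert("INSERT INTO orders VALUES (102, 1, 5.0)")

print(duplicate_pk)
print(missing_fk)
print(valid)

conn.close()
```

The primary key keeps each row uniquely identifiable, while the foreign key guarantees that every order refers to a customer that actually exists.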