How Do We Design a Good Online Course for Business Analytics?

Zhipeng Lei
4 min read · Feb 27, 2021

Massive open online course (MOOC) websites, such as Udacity, Udemy, and Coursera, give learners a low-cost, efficient way to acquire knowledge. Business analytics, which relies mainly on computer coding, is a popular topic on MOOC websites. Given the impact of the COVID-19 pandemic on online learning and a job market with plenty of business analytics openings, we expect the demand for online business analytics courses to keep rising.

In this blog post, we aim to discover the secret of designing an excellent online course for business analytics. We study the business analytics courses on the MOOC website Udemy. Compared with MOOC websites like Udacity and Coursera, Udemy offers many more courses because it allows anyone to upload one. Learning on Udemy is also cheap, since Udemy regularly runs a $10-per-course promotion.

We try to answer the following questions:

· What are the features of online Business Analytics courses?

· How is the course content introduced?

· How is the instructor introduced?

· Which factors affect the enrollment of online Business Analytics courses?

The code of this blog can be found at https://github.com/leizhipeng/analyze_mooc_course_data

Pipeline

We use the Selenium package with Python to scrape the Udemy website. The list of 1005 courses comes from Udemy’s webpage for Business Analytics & Intelligence Courses. Information such as the enrollment number, course rating, and course description is scraped from each course’s web page, so the dataset contains both numerical and text data. We use Gensim to perform topic modeling on the course description and the instructor introduction.
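Scraped values arrive as display strings, so they need converting to numbers before any analysis. Here is a minimal sketch of that cleaning step. The string formats (for example "12,345 students") and the helper name `parse_count` are assumptions for illustration, not taken from the linked repository.

```python
import re


def parse_count(text):
    """Return the first integer found in a display string, or None.

    Hypothetical helper: assumes counts look like "12,345 students".
    """
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


# Toy strings standing in for scraped page text (not real Udemy data).
raw_course = {"enrollment": "12,345 students", "lectures": "48 lectures"}
course = {key: parse_count(value) for key, value in raw_course.items()}
print(course)
```

Returning `None` for strings without digits keeps missing values visible instead of silently turning them into zeros.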

Exploring Non-Text Variables

The distribution of enrollment numbers is heavily right-skewed, with a maximum of 342288, a mean of 5785, and a standard deviation of 21279. The 75th percentile is 3145, so most courses have between 0 and 4000 enrollments. We apply a log transformation to bring the variable closer to a normal distribution.
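The effect of the log transformation can be illustrated on a few toy numbers. The sketch below uses `log1p`, that is log(1 + x), which is an assumption made here so that courses with zero enrollments stay defined; the post does not say which variant was used. The enrollment values are made up for illustration.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean cubed deviation divided by sd cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Toy enrollment counts with a long right tail (not the scraped data).
enrollments = [0, 10, 100, 1000, 10000, 100000]

# log1p(x) = log(1 + x) stays defined for courses with zero enrollments.
log_enrollments = [math.log1p(v) for v in enrollments]

print(round(skewness(enrollments), 2), round(skewness(log_enrollments), 2))
```

A skewness near zero after the transformation indicates a roughly symmetric distribution, which suits linear models better.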

Because Udemy regularly runs its $10-per-course promotion, the price variable shows little variance. The original price variable, by contrast, varies widely, with a maximum of 199.99, a mean of 15.97, and a standard deviation of 17.07.

The lecture number and the number of downloadable resources are highly skewed, with most values below 100 and 20, respectively.

Exploring Text Variables

We plot a word cloud of the course descriptions to get a general impression of the text. The most popular terms are “power bi”, “excel”, “tableau”, and “sql”.

Latent Dirichlet allocation (LDA) topic modeling is used to find topics in the texts. We then plot word clouds of the topics found in the course descriptions and the instructor introductions.

Analysis of Variables

Using the scikit-learn package, we set the enrollment number as the target variable and fit Ridge, Lasso, and decision tree regression models to the dataset. Of the three, the decision tree regression model is the most accurate.

  • The important variables in the decision tree regression are lectures, five_stars, four_stars, three_stars, two_stars, downloadable_resources, instructor_no_courses, descr_LDA_1, descr_LDA_2, instr_LDA_0, and instr_LDA_8.
  • The four LDA topics can be interpreted as Power BI and visualization, project reports involving SQL, instructors at a university, and instructors with many years of experience.
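The model comparison above can be sketched as follows. This is an illustration on synthetic data, not the author's code. The column names echo the post, while the values, the hyperparameters (`alpha`, `max_depth`), and the held-out R² metric are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: the column names echo the post, the values do not.
rng = np.random.default_rng(0)
feature_names = ["lectures", "five_stars", "downloadable_resources"]
X = rng.uniform(0, 100, size=(300, len(feature_names)))
# Toy target playing the role of the log-transformed enrollment number.
y = np.log1p(X[:, 1]) + 0.01 * X[:, 0] + rng.normal(0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "tree": DecisionTreeRegressor(max_depth=5, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))  # held-out R^2

# Impurity-based importances from the fitted tree; they sum to 1.
importances = dict(zip(feature_names, models["tree"].feature_importances_))
print(scores)
print(importances)
```

A variable with an importance of zero was never used for a split, which is one way to read a list of "important" variables from a fitted tree.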

We use an ordinary least squares (OLS) model to check the impact of the independent variables on the dependent variable (enrollment number).

  • The variables original_price, lectures, five_stars, four_stars, three_stars, two_stars, and downloadable_resources have a significant impact on the dependent variable.
  • The significant variables only partially overlap with the important variables in the decision tree regression model.
  • The coefficients of original_price, lectures, five_stars, four_stars, three_stars, two_stars, and downloadable_resources are all positive, so each of these variables raises the enrollment number.
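To make the reading of OLS output concrete, here is a hand-rolled NumPy sketch on synthetic data. It is not the author's code, and the variable values are made up. The sign of a coefficient gives the direction of the effect, and a t-statistic with absolute value above roughly 2 is the usual rule of thumb for significance at the 5% level.

```python
import numpy as np

# Synthetic stand-in data; names echo the post, values are made up.
rng = np.random.default_rng(0)
n = 200
lectures = rng.uniform(0, 100, n)
noise_col = rng.uniform(0, 100, n)  # deliberately unrelated to y
y = 1.0 + 0.05 * lectures + rng.normal(0, 1.0, n)

X = np.column_stack([np.ones(n), lectures, noise_col])  # intercept first
names = ["const", "lectures", "noise_col"]

# OLS estimate: beta = (X'X)^{-1} X'y
xtx_inv = np.linalg.inv(X.T @ X)
beta = xtx_inv @ X.T @ y

# Standard errors from the residual variance, then t-statistics.
residuals = y - X @ beta
sigma2 = residuals @ residuals / (n - X.shape[1])
std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
t_stats = beta / std_err

for name, b, t in zip(names, beta, t_stats):
    print(f"{name}: coef={b:.3f}, t={t:.2f}")
```

A statistics library such as statsmodels reports the same coefficients and t-statistics along with p-values; the manual version is shown only to expose the arithmetic.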

Conclusion

  • To design an excellent online course for business analytics, we can increase the original price, the number of lectures, and the number of downloadable resources; our models associate all three with higher enrollment.
  • The course description should emphasize project reporting and visualization technologies such as SQL, Excel, and Power BI.
  • The instructor introduction should emphasize the instructor’s teaching experience, especially many years of teaching business analytics at a university.

Zhipeng Lei

An engineer and researcher with interests in numerical methods, deep learning, and computational mechanics.