  • Methods: Python
  • Data Analysis: Pandas, NumPy
  • Data Visualization: Matplotlib
  • Data Modeling:
  • Web Development: HTML, CSS

Exploring Covariance of Randomly Generated Data and Sales Data with NumPy

  • The objective of this project is to explore the covariance of randomly generated data and of sales data in a Python Jupyter notebook.
  • The random data is generated with the random library. The sales data is imported from a CSV file and contains the total daily advertising spend and the total daily sales ($).
  • The positive covariance indicates that the two variables tend to move in the same direction: on days when more was spent on advertising, total daily sales tended to be higher.
  

Results

-The randomly generated array data1 holds 1000 samples drawn from a normal distribution with a mean of 100 and a standard deviation of 20.

-The randomly generated array data2 also holds 1000 samples. It combines a scaled version of data1 with samples from another normal distribution with a mean of 500 and a standard deviation of 10.

-The mean and standard deviation of data1 and data2 were calculated as data1: mean = 100.713, stdv = 20.280 and data2: mean = 520.048, stdv = 10.527.
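
The generation step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses NumPy's random generator rather than whichever random library the notebook used, and the seed, the additive form of the combination, and the 0.2 scale factor are all assumptions, so the printed values will not match the figures reported above.

```python
import numpy as np

# Seeded generator so the run is repeatable (the seed is an arbitrary choice).
rng = np.random.default_rng(seed=1)

# data1: 1000 samples from a normal distribution with mean 100 and std 20.
data1 = rng.normal(loc=100, scale=20, size=1000)

# data2: a scaled copy of data1 plus samples from a normal distribution
# with mean 500 and std 10.
# ASSUMPTION: the 0.2 scale factor and the additive form are illustrative;
# the write-up does not state how the two parts were combined.
SCALE = 0.2
data2 = SCALE * data1 + rng.normal(loc=500, scale=10, size=1000)

# Summary statistics (population standard deviation, NumPy's default).
for name, values in (("data1", data1), ("data2", data2)):
    print(f"{name}: mean={values.mean():.3f} stdv={values.std():.3f}")
```

Because data2 is built from data1, the two arrays are correlated by construction, which is what makes their covariance worth inspecting.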

-The CSV file contains two columns: the column at index 0 is the total amount spent on advertising for a given day, and the column at index 1 is the total sales for that same day. The file is loaded with NumPy and the two fields are displayed as a scatter plot.
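
One way the loading and plotting step might look is sketched below. The function names `load_ad_sales` and `plot_ad_sales` are hypothetical, the sketch assumes a comma-delimited file with no header row, and the three in-memory rows are made-up placeholders standing in for the real file, not the project's data.

```python
import io

import numpy as np


def load_ad_sales(source):
    """Load a two-column CSV: column 0 = ad spend, column 1 = total sales.

    `source` may be a file path or an open file-like object.
    ASSUMPTION: comma-delimited, no header row (pass skiprows=1 to
    np.loadtxt if the real file has one).
    """
    table = np.loadtxt(source, delimiter=",")
    return table[:, 0], table[:, 1]


def plot_ad_sales(ad_spend, sales):
    """Scatter plot of daily sales against daily advertising spend."""
    # Imported here so that loading works even where Matplotlib is absent.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(ad_spend, sales)
    ax.set_xlabel("Amount spent on advertising ($)")
    ax.set_ylabel("Total daily sales ($)")
    return fig


# Made-up placeholder rows standing in for the real CSV file.
sample = io.StringIO("100.0,520.0\n80.0,515.0\n120.0,525.0\n")
ad_spend, sales = load_ad_sales(sample)
print(ad_spend, sales)
```

With the real file, `load_ad_sales` would be given the file's path instead of the in-memory sample, and `plot_ad_sales(ad_spend, sales)` followed by `plt.show()` would display the scatter plot.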

-The mean and standard deviation of data3 (Amount Spent on Advertising) and data4 (Total Daily Sales) were calculated as Amount Spent on Advertising: mean = 99.095, stdv = 19.741 and Total Daily Sales: mean = 519.955, stdv = 10.340.

-The positive covariance indicates that the two variables tend to move in the same direction: on days when more was spent on advertising, total daily sales tended to be higher.
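
For reference, the covariance itself can be computed as in the sketch below, which compares a hand-written sample covariance with `np.cov`. The helper `sample_covariance` is a hypothetical name, and the five data points are made-up placeholders rather than the project's data; whether the notebook used the sample (n - 1) or population (n) form is not stated, so the sample form here is an assumption.

```python
import numpy as np


def sample_covariance(x, y):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)


# Made-up placeholder values, not the project's data.
ad_spend = np.array([80.0, 90.0, 100.0, 110.0, 120.0])
sales = np.array([510.0, 516.0, 519.0, 524.0, 531.0])

manual = sample_covariance(ad_spend, sales)

# np.cov returns a 2x2 covariance matrix; the off-diagonal entry [0, 1]
# is the covariance between the two inputs (n - 1 normalisation by default).
from_numpy = np.cov(ad_spend, sales)[0, 1]

print(f"manual={manual:.3f} numpy={from_numpy:.3f}")
```

A positive value means that above-average advertising days tend to coincide with above-average sales days. The magnitude depends on the units of both variables, so it shows the direction of the relationship rather than its strength.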


Contact

Please feel free to reach out through the following platforms: