Figures 1-3 show that refuse accounts for the largest share of total waste. All five boroughs show clear seasonality, but none of the time series shows an obvious long-term trend. People tend to generate more waste in summer than in winter. Moreover, because the three waste types move in the same direction over time, they are highly correlated with one another (Figure 4), which can also simplify our previous time series model.
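The correlation between waste types noted above can be illustrated with a minimal sketch. It computes pairwise Pearson correlations in plain Python; the series names and monthly tonnages are made-up assumptions for illustration, not the study's data, and the text does not state which correlation measure was used for Figure 4.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Made-up monthly tonnages (illustrative only, NOT the study's data).
monthly_tons = {
    "refuse": [100, 95, 105, 115, 125, 135, 140, 138, 125, 115, 105, 100],
    "paper": [30, 29, 31, 34, 37, 40, 41, 40, 37, 34, 31, 30],
    "mixed_recyclables": [25, 24, 26, 28, 31, 33, 34, 34, 31, 29, 26, 25],
}

matrix = correlation_matrix(monthly_tons)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Series that share a summer peak and a winter trough, as in this toy input, produce off-diagonal values close to 1.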
The results of the six time series models mentioned in par are shown in Table 1. The real and predicted values are shown in Figures 6-8.
As shown above, the LSTM model successfully captured the fluctuations of the time series. All five boroughs have similar trends and seasonality: waste generation peaks in summer and falls to its lowest point in winter. This pattern suggests that waste management and circular economy measures may need to differ by season.
In the clustering analysis, the Gaussian Mixture model achieved a silhouette score of 0.3988. The algorithm grouped the community districts into two clusters of related districts; Figure 7.1.2-1 shows their geographic distribution.
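To make the silhouette score reported above concrete, the following sketch shows how the score is defined once each point has a hard cluster label. It does not reproduce the Gaussian Mixture fit itself. The Euclidean distance metric, the function name, and the toy points are assumptions made for illustration.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy 2-D points (illustrative only): two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(round(good, 3), round(bad, 3))
```

The score ranges from -1 to 1. Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values indicate points that sit closer to another cluster than to their own.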
The first cluster (dark blue) contained most of the community districts with lower household income. The median household income in this cluster was $52,490, and most of its community districts fell in the $50k-65k range (Figure 11 in the appendix).
Applying a regression model to this cluster showed a positive correlation between income and refuse generation per capita (Figure 12 in the appendix), with coefficient = 0.3964, and a positive correlation between income and recyclable (paper + mix-recyclable) generation per capita (Figure 13), with coefficient = 0.8617. The second cluster (light yellow) contained most of the community districts with higher household income; the median household income in this cluster was $105,616 (Figure 14).
The regression results for the second cluster differed from those for the first. There was a negative correlation between income and refuse generation per capita (Figure 15 in the appendix), with coefficient = -0.557, and a positive correlation between income and recyclable (paper + mix-recyclable) generation per capita (Figure 15), with coefficient = 0.7175.
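The per-cluster regressions described above can be sketched as an ordinary least-squares fit run separately for each cluster. The text does not say whether the reported coefficients are raw slopes, standardized slopes, or correlation coefficients, so this sketch simply assumes a raw slope. The cluster names, incomes, and per-capita figures are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slopes_by_cluster(rows):
    """rows: iterable of (cluster, income, per_capita_waste) tuples."""
    grouped = {}
    for cluster, income, waste in rows:
        xs, ys = grouped.setdefault(cluster, ([], []))
        xs.append(income)
        ys.append(waste)
    return {cluster: fit_line(xs, ys)[0] for cluster, (xs, ys) in grouped.items()}


# Made-up rows (illustrative only): income in $k, refuse per capita in
# arbitrary units. The cluster names "lower" and "higher" are hypothetical.
rows = [
    ("lower", 45, 1.0), ("lower", 50, 1.1), ("lower", 55, 1.2), ("lower", 60, 1.3),
    ("higher", 90, 1.4), ("higher", 100, 1.3), ("higher", 110, 1.2), ("higher", 120, 1.1),
]

slopes = slopes_by_cluster(rows)
print(slopes)
```

Fitting each cluster separately is what allows the sign of the income effect to differ between clusters; a single fit over all districts would average the two opposite slopes together.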
From the above analysis, we can conclude that income does affect waste generation. In the lower-income cluster, refuse generation per capita rises with household income, and so does recyclables (paper + mix-recyclable) generation per capita. In the higher-income cluster, refuse generation per capita falls as household income rises, while recyclables generation per capita still rises with income. In short, the influence of socioeconomic background differs across income levels, most notably for refuse generation and much less for recyclables generation.
Based on our time series model, all boroughs except Manhattan saw a surge in waste generation during the COVID-19 period, especially in June. This may be partly explained by the fact that people living outside Manhattan did not have to commute to Manhattan for work while the stay-at-home policy was in effect.
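One simple way to quantify such a surge is the percentage by which observed tonnage exceeds the model's prediction for the same month. The sketch below assumes that definition; the 10% threshold and all tonnage figures are hypothetical and are not results from this study.

```python
def percent_deviation(observed, predicted):
    """Percentage by which the observed value exceeds the predicted value."""
    if predicted == 0:
        raise ValueError("predicted value must be non-zero")
    return 100.0 * (observed - predicted) / predicted


def flag_surges(observed, predicted, threshold_pct=10.0):
    """Return {borough: deviation %} for boroughs above the threshold."""
    surges = {}
    for borough, value in observed.items():
        deviation = percent_deviation(value, predicted[borough])
        if deviation > threshold_pct:
            surges[borough] = deviation
    return surges


# Made-up monthly tonnages (illustrative only, NOT the study's data).
observed = {"Brooklyn": 115.0, "Manhattan": 92.0}
predicted = {"Brooklyn": 100.0, "Manhattan": 100.0}

surges = flag_surges(observed, predicted)
print(surges)
```

With these toy numbers only the first borough is flagged, since its observed value is 15% above the prediction while the second falls below it.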
We applied the same approach as in section 7.1.2: Gaussian Mixture clustering, which here gave a silhouette score of 0.4702. The algorithm again produced two clusters of related districts; Figure 7.2.2-1 shows their geographic distribution. The first cluster (dark blue) contained community districts with a median household income of $53,205, and most of them fell in the $50k-65k range (Figure 19).
The regression model showed a moderate positive correlation between income and refuse generation per capita (Figure 20), with coefficient = 0.3638, and a strong positive correlation between income and recyclable (paper + mix-recyclable) generation per capita (Figure 21), with coefficient = 0.8469.
The second cluster (light yellow) contained most of the community districts with higher household income; the median household income in this cluster was $115,993 (Figure 22).
The regression showed a strong negative correlation between income and refuse generation per capita (Figure 23), with coefficient = -0.7784, and a very weak negative correlation between income and recyclable (paper + mix-recyclable) generation per capita (Figure 24 in the appendix), with coefficient = -0.0136.