Understanding the structure and dynamics of criminal activities is fundamental to designing crime prediction tools that can also guide crime prevention. Here, we study crimes committed in city community areas using police crime reports and demographic data for the City of Chicago collected over 16 consecutive years. Our goal is to understand how the network of city community areas shapes the dynamics of criminal offenses and the demographic characteristics of the areas' inhabitants. Our results reveal the presence of criminal hot-spots and expose the dynamic nature of criminal activities. We identify the most influential features for forecasting the per capita crime rate in each community. Our results indicate that crime in city community areas is driven by spatio-temporal dynamics: the numbers of past crimes committed in each community area itself and among its spatial neighbors are the most important features in our predictive models. Moreover, certain urban characteristics appear to act as triggers for the spatial spreading of criminal activities. Using the k-Means clustering algorithm, we obtain three clearly separated clusters of community areas, each with a different level of crime and distinct demographic characteristics of its inhabitants. Further, we demonstrate that crime predictive models incorporating both the demographic characteristics of a community and its crime rate perform better than models relying on only one type of feature. We develop predictive algorithms to forecast the number of future crimes in city community areas over periods of one month and one year using varying sets of features. For one-month predictions that use only the number of prior incidents as a feature, a critical length of historical data, τc, of 12 months emerges. Using more than τc months ensures high accuracy of prediction, while using fewer months negatively impacts prediction quality. 
Adding features based on the demographic characteristics of the inhabitants somewhat mitigates this negative impact. We also forecast the number of crimes in each community area in a given year. We then study in which community area, and over what period, an increase in crime reduction funding would yield the largest reduction in crime across the entire city. Finally, we study and compare the performance of various supervised machine learning algorithms that classify reported crime incidents into the correct crime category. Using the temporal patterns of various crime categories improves the classification accuracy. The methodologies introduced here are general and can be applied to other cities for which data about criminal activities and demographics are available.
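
To make the spatio-temporal feature construction concrete, the following is a minimal sketch, assuming monthly crime counts arranged as a months-by-areas array and a binary adjacency matrix of community areas. The function names (`build_lag_features`, `fit_linear`, `predict_linear`), the data layout, and the use of ordinary least squares are illustrative assumptions, not the models used in this study. The parameter `tau` plays the role of the history length that is compared against τc.

```python
import numpy as np


def build_lag_features(counts, adjacency, tau):
    """Build one-month-ahead training pairs from lagged crime counts.

    counts    : array of shape (n_months, n_areas), monthly crime counts
    adjacency : array of shape (n_areas, n_areas), 1 where two areas are neighbors
    tau       : number of past months used as history (hypothetical parameter)

    Each row of X holds tau lagged counts for one area followed by tau lagged
    mean counts over that area's spatial neighbors; y is the next month's count.
    """
    counts = np.asarray(counts, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    n_months, n_areas = counts.shape
    if tau < 1 or tau >= n_months:
        raise ValueError("tau must satisfy 1 <= tau < n_months")

    # Mean count over the neighbors of each area, for every month.
    degree = adjacency.sum(axis=1)
    degree[degree == 0] = 1.0  # isolated areas get a neighbor mean of zero
    neighbor_mean = counts @ adjacency.T / degree

    rows, targets = [], []
    for t in range(tau, n_months):
        own = counts[t - tau:t].T          # shape (n_areas, tau)
        nbr = neighbor_mean[t - tau:t].T   # shape (n_areas, tau)
        rows.append(np.hstack([own, nbr]))
        targets.append(counts[t])
    return np.vstack(rows), np.concatenate(targets)


def fit_linear(X, y):
    """Ordinary least squares with an intercept (an illustrative baseline)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the coefficients returned by fit_linear."""
    return coef[0] + X @ coef[1:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: 48 months, 4 areas arranged in a ring.
    counts = rng.poisson(lam=50, size=(48, 4))
    ring = np.array([[0, 1, 0, 1],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [1, 0, 1, 0]])
    X, y = build_lag_features(counts, ring, tau=12)
    coef = fit_linear(X, y)
    print(X.shape, y.shape, coef.shape)
```

Varying `tau` in such a setup and comparing held-out error is one way to probe how much history a forecast needs; demographic features could be appended as additional columns of `X`.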