### Study area

Jinan, the capital of Shandong Province, is known as the "Spring City" for its springs. It is also the political, economic, cultural, technological, educational, and financial center of the province. The city lies between 36°01′N and 37°32′N and between 116°11′E and 117°44′E. As of 2020, Jinan had 12 county-level administrative regions: ten municipal districts (Shizhong, Lixia, Huaiyin, Tianqiao, Licheng, Changqing, Zhangqiu, Jiyang, Laiwu, and Gangcheng) and two counties (Pingyin and Shanghe). The total area of the city is 10,244 km^{2}, the built-up area is 760.6 km^{2}, and the urbanization rate is 71.21%. According to the latest census data, the registered population of Jinan reached 8.16 million, ranking 27th in the country.

Jinan connects the Beijing-Tianjin-Hebei region to the north with the Yangtze River Delta to the south and is one of the important transportation hubs in East China. Because of its springs, Jinan is a famous historical and cultural city and was among the first batch of excellent tourist cities in China. Jinan is also a national pilot city for the transformation of new and old kinetic energy^{27}. In the first three quarters of 2022, the GDP of Jinan (864.201 billion Chinese yuan) ranked 20th in China. In this study, we chose the downtown and central areas of Jinan as the study area (Fig. 1).

### Data sources

*OpenStreetMap (OSM) Data*. OSM is a free, publicly accessible source of road network data. It has high positioning accuracy and accurate topological relationships, and its attributes include longitude and latitude, road type, road name, maximum driving speed, and other information. We downloaded the data from the OSM website and processed it using ArcGIS software.

*Point-of-Interest (POI) Data*. POI data are point data that describe real geographical entities with spatial attribute information. They have several advantages: large data size, wide coverage, easy access, detailed information (e.g., names of buildings, categories of buildings, and coordinates), and timely updates. Because of these advantages, POI data are widely used in urban studies. Currently available POI sources include Baidu, Gaode, and Tencent in China, as well as Google and OSM in western countries. To provide an accurate and convincing result, we used Gaode POI data because they have better data integrity in our study area. In this study, we collected 2020 POI data from Gaode. Each POI record contains a series of basic fields, for example, name, longitude and latitude, administrative region, and address.

The original POI data have three levels of categories (23 big categories, 267 mid categories, and 869 subcategories). We removed duplicate and missing records, which left 72,323 records for 2015 and 332,329 records for 2020. Based on the “Standards of Urban Land Classification and Planning Construction Land” (GB50137-2011) and the land use classification of Jinan City, we further reclassified the data into six new categories: residential area, business area, industrial area, public service facilities, traffic facilities, and green space and square (Table 1)^{28}.

*Other Datasets*. Besides OSM and POI data, we used other datasets to verify our results: (1) Boundary map of administrative zones. The boundary map of the administrative zones of Jinan City was obtained from the Geographical Information Monitoring Cloud Platform. (2) Remote sensing images. We collected remote sensing images from Google Earth and Gaode for August 2015 and May 2020; their spatial resolution is 30 m × 30 m. (3) Statistical yearbook data. We acquired these data from the Jinan Bureau of Statistics. (4) Maps of urban planning in Jinan. These maps mainly come from the official website of the Jinan city government (http://gh.nrp.jinan.gov.cn/).

### Research methods

*Frequency Density and Ratio Index*. Frequency density (FD) is the most commonly used index for identifying urban functions. For each POI type, it relates the number of records in a unit to the total number of records of that type. In this study, we count the POI records of each type in each identification unit, compute the frequency density and ratio index per unit, and identify the functional area of each unit from these two indices. The formulas are^{29,30}:

$$F_i = \frac{n_i}{N_i}, \quad i \in \left( 1, 2, \ldots, k \right)$$

(1)

$$C_i = \frac{F_i \times W_i}{\sum\limits_{i = 1}^{k} \left( F_i \times W_i \right)} \times 100\%, \quad i \in \left( 1, 2, \ldots, k \right)$$

(2)

where *i* indicates the POI type and *k* is the total number of POI types; *n*_{i} is the number of type *i* POIs in a unit, and *N*_{i} is the total number of type *i* POIs; *F*_{i} is the frequency density of type *i* POIs in a unit, and *C*_{i} is the proportion of the frequency density of type *i* POIs in that unit; *W*_{i} is the weight of each frequency density (generally 1).
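As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the frequency density and ratio index for a single unit. The function names, the dictionary layout, and the POI counts are hypothetical and chosen only for this example; all weights default to 1, as stated above.

```python
def frequency_density(unit_counts, city_totals):
    """Eq. (1): F_i = n_i / N_i for every POI type i."""
    return {t: unit_counts.get(t, 0) / city_totals[t] for t in city_totals}


def ratio_index(fd, weights=None):
    """Eq. (2): C_i = F_i * W_i / sum(F_i * W_i) * 100, in percent."""
    weights = weights or {t: 1.0 for t in fd}  # assumption: W_i = 1 by default
    weighted = {t: fd[t] * weights[t] for t in fd}
    total = sum(weighted.values())
    if total == 0:  # unit without any POI: no ratio can be formed
        return {t: 0.0 for t in fd}
    return {t: 100.0 * v / total for t, v in weighted.items()}


# Hypothetical counts: N_i for the whole study area, n_i for one unit.
city_totals = {"residential": 1000, "business": 4000, "industrial": 500}
unit_counts = {"residential": 10, "business": 20}

fd = frequency_density(unit_counts, city_totals)
ci = ratio_index(fd)
```

In this made-up unit, business POIs are more numerous in absolute terms, but residential POIs have the higher ratio index because they are rarer across the study area. This is the effect that normalizing by *N*_{i} is meant to capture.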

*Kernel density estimation (KDE) Method*. Kernel density estimation is an important and intuitive method to measure the aggregation degree of POI data^{30}. It not only clearly displays the number of POI records in each unit but also allows the aggregation degree of POI records in different areas to be compared and the results to be visualized^{32,33}. The kernel density is calculated as in Eq. (3):

$$f\left( x \right) = \sum\limits_{i = 1}^{n} \frac{1}{h^2} \Phi \left( \frac{x - c_i}{h} \right)$$

(3)

where *f*(*x*) is the estimated kernel density at location *x*, *c*_{i} is the location of the *i*th POI within the bandwidth, *h* is the distance-decay threshold (i.e., the bandwidth), *n* is the number of POIs whose distance from *x* is less than or equal to *h*, and *Φ* is a spatial weight function. In this study, we use the quartic weight function (Eq. 4).

$$\Phi \left( \frac{x - c_i}{h} \right) = \frac{3}{4}\left[ 1 - \frac{\left( x - c_i \right)^2}{h^2} \right]$$

(4)
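To make Eqs. (3) and (4) concrete, the sketch below evaluates the kernel density at one location with plain Python. It assumes planar (projected) coordinates in metres and uses the weight function exactly as written in Eq. (4); the function names and sample points are hypothetical, and a GIS package may apply additional scaling that is not reproduced here.

```python
import math


def kernel_weight(u):
    """Eq. (4) as written: 3/4 * (1 - u^2), where u = distance / h."""
    return 0.75 * (1.0 - u * u)


def kde_at(x, pois, h):
    """Eq. (3): sum the weights of all POIs within distance h of x, scaled by 1/h^2."""
    total = 0.0
    for cx, cy in pois:
        d = math.hypot(x[0] - cx, x[1] - cy)  # Euclidean distance from x to the POI
        if d <= h:  # only POIs inside the bandwidth contribute
            total += kernel_weight(d / h)
    return total / (h * h)


# Hypothetical POI locations (projected coordinates, metres).
pois = [(0.0, 0.0), (3.0, 4.0), (100.0, 100.0)]
density = kde_at((0.0, 0.0), pois, h=10.0)
```

With a bandwidth of 10 m, the third point lies outside the search radius and is ignored, while the point 5 m away contributes less than the point at the evaluation location itself.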

*CA–Markov Model*. The CA–Markov model is a robust approach to the spatial and temporal dynamic modeling of land use change. Because it combines the time-series prediction of the Markov model with the spatial prediction of cellular automata (CA) theory, it can be used to simulate spatial–temporal patterns^{34,35}. Many studies have used the CA–Markov model; see^{35,36,37} for more detail. In this study, we use the Markov model in IDRISI software to obtain the land transfer area matrix, the LOGISTICREG module to generate suitability maps, and the CA–Markov model to simulate future land use patterns^{38}.
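The Markov step can be illustrated with a small cross-tabulation. The sketch below is not IDRISI's implementation; it only shows, under simplifying assumptions, how a land transfer area matrix and the corresponding transition probabilities can be derived from two classified grids of the same extent. The grids, class codes, and function names are hypothetical.

```python
def transfer_matrix(before, after, n_classes, cell_area=1.0):
    """Cross-tabulate two class grids: entry [a][b] is the area that changed from class a to class b."""
    matrix = [[0.0] * n_classes for _ in range(n_classes)]
    for row_before, row_after in zip(before, after):
        for a, b in zip(row_before, row_after):
            matrix[a][b] += cell_area
    return matrix


def transition_probabilities(matrix):
    """Normalize each row of the area matrix so that it sums to 1."""
    probs = []
    for row in matrix:
        total = sum(row)
        probs.append([v / total if total else 0.0 for v in row])
    return probs


# Hypothetical 3 x 3 land use grids with class codes 0, 1, 2 at two dates.
before = [[0, 0, 1],
          [0, 1, 1],
          [2, 2, 2]]
after = [[0, 1, 1],
         [0, 1, 1],
         [2, 2, 1]]

areas = transfer_matrix(before, after, n_classes=3)
probs = transition_probabilities(areas)
```

Here one cell of class 0 and one cell of class 2 convert to class 1, so the corresponding rows of `probs` assign a probability of 1/3 to those transitions.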

*Information Entropy*. Shannon proposed information entropy in 1948. It is a measure of uncertainty and here describes the degree of order of the urban spatial structure. Its value reflects the number of functional area types and the evenness of their distribution: the more types of functional areas, the greater the information entropy and the lower the degree of order of the functional area system. The equation is shown in Eq. (5)^{39,40}:

$$H = - \sum\limits_{i = 1}^{n} \frac{A_i}{\sum\nolimits_{i = 1}^{n} A_i} \ln \left( \frac{A_i}{\sum\nolimits_{i = 1}^{n} A_i} \right)$$

(5)

where *H* is the information entropy (unit: nat), *n* is the number of urban functional area types, *A*_{i} is the total area of type *i*, and *A* = ∑ *A*_{i} is the total land area.

Based on the equation of information entropy, we can define the equilibrium degree of spatial structure:

$$J = \frac{H}{H_{\max}} = \frac{ - \sum\nolimits_{i = 1}^{n} \frac{A_i}{\sum\nolimits_{i = 1}^{n} A_i} \ln \left( \frac{A_i}{\sum\nolimits_{i = 1}^{n} A_i} \right)}{\ln n}$$

(6)

where *J* is the equilibrium degree, *H* is the actual information entropy, and *H*_{max} is the maximum information entropy. When *J* = 0, the spatial structure is in the most unbalanced state; when *J* = 1, the spatial structure reaches the ideal equilibrium state. The dominance degree, in contrast, reflects the degree to which one or more types of functional areas dominate the region; it is the opposite of the equilibrium degree. The equation is shown in Eq. (7).
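Eqs. (5) and (6) can be checked numerically with a few lines of Python. The sketch below assumes that functional area types with zero area contribute nothing to the sum (the usual convention 0 · ln 0 = 0) but still count toward *n*; the function names and area values are hypothetical.

```python
import math


def information_entropy(areas):
    """Eq. (5): H = -sum(p_i * ln p_i), with p_i = A_i / sum(A_i), in nats."""
    total = sum(areas)
    shares = [a / total for a in areas if a > 0]  # zero-area types are skipped
    return -sum(p * math.log(p) for p in shares)


def equilibrium_degree(areas):
    """Eq. (6): J = H / H_max, with H_max = ln n."""
    n = len(areas)  # assumption: n counts every type, including zero-area ones
    if n < 2:
        return 0.0  # ln 1 = 0, so J is undefined for a single type
    return information_entropy(areas) / math.log(n)


# Hypothetical areas (km^2) of four functional area types.
even = [25.0, 25.0, 25.0, 25.0]
skewed = [97.0, 1.0, 1.0, 1.0]

H_even, J_even = information_entropy(even), equilibrium_degree(even)
J_skewed = equilibrium_degree(skewed)
```

An even split across four types gives *H* = ln 4 and *J* = 1, whereas the skewed distribution yields a *J* much closer to 0, which matches the interpretation of the two limiting cases above.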

*Mixing Degree Model*. Mixed-use urban functions are two or more different functions within one unit. A typical example is a commercial and residential building, in which the lower stories serve commercial functions and the higher stories serve residential functions. To address the shortcomings of FD, we apply a revised information entropy to measure the degree of mixing of urban functions. We introduce the ratio index into the conventional information entropy, which overcomes the shortcoming of determining mixed-use urban functions from FD and the ratio index alone. A high value of the revised information entropy indicates a high degree of mixing (more functions in one unit)^{41}. The revised information entropy is:

$$M = - \sum\limits_{i = 1}^{n} \left( C_i \times \ln C_i \right)$$

(8)

where *M* is the degree of mixing, *n* is the number of POI types in one unit, and *C*_{i} is the ratio index of POI type *i* in that unit.
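A minimal Python sketch of Eq. (8) follows. It assumes that the ratio index *C*_{i} is expressed as a proportion between 0 and 1 rather than as a number of percentage points, and that POI types with *C*_{i} = 0 are skipped; the function name and the sample ratios are hypothetical.

```python
import math


def mixing_degree(ratios):
    """Eq. (8): M = -sum(C_i * ln C_i), skipping types with C_i = 0."""
    return -sum(c * math.log(c) for c in ratios if c > 0)


# Hypothetical ratio indices of three POI types in two units.
single_use = mixing_degree([1.0, 0.0, 0.0])  # one function only
mixed_use = mixing_degree([0.5, 0.5, 0.0])   # two functions in equal shares
```

A unit with a single function has *M* = 0, and a unit split evenly between two functions has *M* = ln 2, so larger values indicate more strongly mixed units.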
