MULTI-LEVEL CITY PORTRAIT RESEARCH BASED ON MULTI-SOURCE DATA

A city portrait is a social impression generated by the interaction between the public and the city. It helps us understand and perceive the nature and characteristics of a city, and thus supports urban development and governance. However, most existing studies extract thematic semantic labels globally and ignore both the order of the tags and their degree of contribution to the topic, which affects the city portrait extraction results. Existing studies also lack an analysis of how using grid areas as the study scale affects city portraits. In this paper, we propose a new approach to accurately identify city labels based on multi-source data grid fusion, using a topic feature word extraction model (Weight-LdaVecNet) that fuses topic word embedding and network structure analysis under feature word weight constraints. On this basis, we construct a multi-level city portrait description framework using hierarchical cluster analysis, extract tag clusters, and obtain a similarity matrix by combining topic feature tags and region feature tags through similarity analysis, with a view to achieving a fine-grained, multi-level city region portrait. The experimental results show that, compared with the traditional LDA model, the city labels with similar thematic semantics identified by our method aggregate strongly, which demonstrates the effectiveness of the proposed method. In addition, in the overall multi-level city portrait, we find that Beijing is strongly attractive in terms of cultural features. However, the regional


INTRODUCTION
City portraits aim to use data to identify and understand city characteristics in order to improve people's living standards and promote urban development. Existing research on city portraits focuses on tourist portraits, destination portraits, and overall city portraits, while less attention has been paid to fine-grained, multi-level city portraits. Through techniques such as manual processing (Lu and Epchenkova, 2015) and natural language processing and data mining (Van Weerdenburg et al., 2019), public perceptions and emotional images of cities can be better understood. Several kinds of indicators and characteristics need to be considered when constructing city portraits, such as history and culture and the level of economic development. However, existing studies have limitations, one of the main ones being the lack of a fine-grained approach to mining the hierarchical structure of city portraits. Exploring this hierarchical structure in depth makes it possible to paint a more comprehensive picture of the characteristics and images of cities and to reflect the complexity and diversity within them.
With the advent of the big data era, the volume of available data has grown, expanding urban research samples, and the use of multi-source data has become a mainstream trend in urban portrait research. POI data (Chen et al., 2021a), trajectory data (Park et al., 2020), social media data (Nastar et al., 2019), and other multi-source data record multi-dimensional urban information from the urban space down to the individual level. These data provide a novel way to realize three-dimensional, integrated, and multidimensional urban sensing and monitoring. However, existing urban portrait studies give little consideration to objective data, and optimizing the urban portrait and identifying the semantic features of urban elements remain challenges in the field.
* Corresponding author: Changfeng Jing, jingcf@cugb.edu.cn
The information generated from city data can be condensed into city labels, which to some extent directly or indirectly convey the attitudes of city observers. The quality of this condensation process therefore affects how the city is perceived by the public. Label extraction has progressed from manual classification of social labels to machine learning and deep learning models that extract semantic feature information, among which topic models are widely used. Traditional label recognition methods mainly rely on manual processing to extract social network labels (Gupta et al., 2011) or on TF-IDF to extract hotspot information (Zhu et al., 2019); however, their recognition performance degrades on large datasets because the semantic information among the data is not taken into account. Machine learning methods, mainly LDA models (Liang et al., 2021) and improved LDA models (Ekinci and İlhan Omurca, 2019), provide new directions for improving recognition accuracy, but they struggle to account locally for the semantic ambiguity that arises from word order. Deep learning methods, such as those that train on datasets with word2vec models (Ma et al., 2023), further improve the quality of the recognized tags.
However, these methods are still not precise enough for recognizing tags. The influence of factors such as the context, order, and degree of contribution of semantic information on the precise recognition of urban tags needs to be considered both globally and locally. In addition, the existing research scale units are suited to data that are spatially aggregated, e.g., urban hotspots (Peng et al., 2020) and tourist areas (Vu et al., 2015). A smaller study scale allows for a more accurate extraction of attribute overviews and a more comprehensive portrayal of the city. Therefore, choosing 1000 m grid cells as the study scale can improve the granularity and comprehensiveness of the analysis.
To address the above problems, this study exploits the complementary nature of multi-source data by fusing them through grid partitioning, and proposes a topic feature word extraction model (Weight-LdaVecNet), based on feature word weight constraints, that combines topic word embedding and network structure analysis for accurate identification of city labels. On this basis, a multi-level city portrait description framework is constructed using hierarchical cluster analysis, clusters of labels are generated, and a similarity matrix is obtained by combining topic feature labels and region feature labels through similarity analysis. Finally, a hierarchical city region portrait is constructed to realize a multi-level city portrait. The main contributions of this study are summarized as follows:
1) Considering that objective data can provide rich city information, more objective data are introduced to realize the complementarity of multi-source data.
2) The proposed Weight-LdaVecNet model has strong aggregation in identifying city tags with similar thematic semantics, which improves the accuracy of city tag identification.
3) The regional thematic features are introduced to build a hierarchical city-region portrait to realize a more fine-grained and comprehensive multi-level city portrait.

Status of Multi-source Data Mining
The spread of mobile networks has caused urban data to accumulate a huge amount of information on culture, function, and characteristics, and mining multiple sources of data has become a characteristic of urban research. In recent years, the massive and diverse nature of geographic big data has led scholars to perceive the dynamic changes of cities from geographic location data and to explore the structural portrait of cities. However, these studies are limited to using geographic location data to reveal the movement patterns of people and goods and to portray the city structure from a macro perspective; they cannot serve the evolving needs of urban development as accurately and quickly as thematic or labeled data can. To develop a more comprehensive and accurate portrait of the city, scholars have begun to study the connection between "hashtags" and "city portraits", building city portraits from the text and images that the public publishes on social media. This has become a hot research topic in recent years. Some scholars have used travelogue text data to mine tourism topic tags and capture the dynamic evolution of city tourism topic portraits (Ye and Xu, 2020). Although using tags to construct city portraits improves comprehensiveness, these studies mainly rely on subjective data, such as microblog check-in text data and Zhihu comment data, and lack objective data support. In addition, these data represent only the perceptions of the majority of the public, which may lead to incomplete imagery carriers and inadequate interpretation of city portraits. In contrast, POI data cover rich information with a large sample size and can reflect all kinds of urban activities to a certain extent (Ma et al., 2020). Urban land use data are also important for describing the characteristics and development trends of cities, providing information on building and land use types and on urban spatial structure. Therefore, using POI data and urban land use data can enrich the semantic information of the city, allow richer spatio-temporal semantic information to be mined from both subjective and objective sources, and provide a solid data foundation for building a multidimensional city portrait.
In summary, relatively few studies fuse subjective and objective data for city portraits or apply city portraits to the fine-grained assessment ("physical examination") of cities. There is therefore an urgent need to combine the advantages of subjective and objective data, mine their semantic information, and accurately identify city labels, so as to build a structured city portrait.

Status of Urban Tag Extracting
City tagging refers to mining textual information for the topics it implies and for the keywords that summarize the core of the text for each city. The earlier TF-IDF method (Ali and Qaiser, 2018) extracts feature words by evaluating the importance of words in the text, but it ignores word order and the associations between words, and data sparsity arises when the dimensionality of the vector is too high. Topic models such as LDA can reduce the dimensionality and alleviate the data sparsity problem. Therefore, most existing studies use LDA for text feature extraction (Chen et al., 2021b), topic label evolution mining (Jiang et al., 2019), destination image description (Xiao et al., 2020), and so on. Although LDA models mine shallow semantic information from documents well, they do not capture the deeper, local semantic relations between words. To address this problem, many studies have improved text classification accuracy to some extent by improving the LDA model or combining it with neural network models (e.g., Word2Vec (Ma et al., 2023)) for text data processing. Wang et al. combined the LDA model and the Word2Vec model to build a latent semantic mapping between documents and topics and to extract the contextual relationships between words in short text, improving the granularity of short-text topics (Wang et al., 2016). Zeng et al. combined the LDA model, the Word2Vec model, and PageRank to achieve secondary mining of keywords (Zeng et al., 2019).
Introducing the Word2Vec model thus allows the deep semantic information of words to be mined locally, which makes up for this shortcoming of the LDA model. Combining LDA and Word2Vec can therefore better mine the semantic information among words, but the influence of the degree of contribution of words in the thematic context on the accuracy of keyword extraction is not considered.
Combining the Word2Vec model and the LDA model has been verified to mine the semantic relationships between tags. Because that combination ignores the degree of contribution of words in each topic context, we take the feature words generated by the LDA model, embed them with the Word2Vec model, and weight them with the feature word weights to constrain the similarity between words. This can improve the accuracy of topic word extraction and yield more representative city tags.

Status of City Portrait Constructing
City portrait research is currently a field of great interest, involving vertical dimensions and multi-source data fusion mechanisms. Existing studies mainly use LDA models to extract city portraits or the portrait of a particular destination (e.g., tourist attraction portraits (Wen et al., 2022)), but multidimensional exploration is lacking. Tag similarity and hierarchical clustering algorithms can characterize cities from multiple dimensions and levels, but they require manual division of topics and involve some subjectivity (Bi et al., 2019; Shi et al., 2021). In response, some scholars have used LDA models to extract thematic features within the framework of tourist place imagery perception research and have semi-quantitatively portrayed the perceived imagery of tourist places in terms of feature dimensions. Although that work explored city thematic portraits from multiple dimensions and reduced the subjectivity of LDA thematic condensation, it could not automatically generate a multi-level city portrait framework (Liang and Li, 2020). In addition, whereas existing city portraits focus on the overall city or on hotspot areas as the research scale (Xie et al., 2017), grid division can split the city into homogeneous areas, which helps researchers examine the details and differences within the city. The impact of grid area feature labels on multi-level city portraits therefore needs to be considered in depth.
In short, the construction of city portraits is crucial for refined city management. The hierarchical clustering algorithm is a useful reference, but the influence of grid area features needs to be considered: the analysis can be scaled down to the grid area level and grid area features introduced to build finer-grained, hierarchical city area portraits, to better understand the differences among city areas and to provide targeted suggestions.

Study Area:
In this study, we selected the area within the Fifth Ring Road of Beijing, China as the study area (Figure 1), covering 667.89 square kilometers. This area has the densest commercial activity, cultural exchange, and human traffic in Beijing, and its urban data are complex, diverse, and comprehensive. This complexity and diversity facilitates the extension of the methodology to other areas.

Materials and Pre-processing:
In recent years, urban semantic data, such as consumer reviews and real-time social media information as well as urban land use and POI data, have become increasingly tied to geographic locations. Urban land use data are more macro and comprehensive, while POI data are richer and more diverse, which helps to enrich the semantic information of social perception data. In this paper, three kinds of data carrying urban semantic information are used to construct urban area portraits: POI data, Weibo check-in data, and urban land use data. About 300,300 POI records were collected from Amap using a web crawler; the secondary categories of all POI records were selected for pre-processing, and the records were assigned to the corresponding grid areas through a spatial join. In addition, 179,052 microblog check-in records were collected from the largest microblog social media platform in China, then cleaned and spatially joined. Urban land use data were also collected, processed to the secondary classification level, and matched to the corresponding grid areas through a spatial join. Finally, the grid was used as the basic unit for hosting and analysing the multi-source data: data from all sources were integrated and merged on the grid areas, forming a more comprehensive dataset to support urban portraits.
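The grid fusion step can be sketched as follows. This is a minimal illustration, assuming that all records have already been projected to planar coordinates in metres and that each record carries a single text label. The function names `grid_cell` and `build_grid_documents`, the grid origin, and the sample coordinates and labels are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict

CELL_SIZE_M = 1000  # grid cell edge length in metres, matching the 1000 m study scale


def grid_cell(x, y, origin_x=0.0, origin_y=0.0, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell that contains point (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((y - origin_y) // cell_size)
    return col, row


def build_grid_documents(records):
    """Group the labels of all records by grid cell.

    `records` is an iterable of (x, y, label) tuples; the result maps each
    occupied cell index to the list of labels that fall inside it, i.e. one
    "document" per grid cell.
    """
    documents = defaultdict(list)
    for x, y, label in records:
        documents[grid_cell(x, y)].append(label)
    return dict(documents)


# Hypothetical sample records from the three sources (coordinates in metres).
poi = [(120.0, 80.0, "chinese_restaurant"), (1500.0, 300.0, "company")]
weibo = [(950.0, 990.0, "flower"), (1999.0, 10.0, "meeting")]
land_use = [(400.0, 400.0, "residential"), (1200.0, 700.0, "commercial")]

grid_docs = build_grid_documents(poi + weibo + land_use)
```

Each entry of `grid_docs` then plays the role of one grid area document in the later topic modelling steps.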

Methods
This study adopts the research framework shown in Figure 2, which contains three modules: data pre-processing, city label extraction, and multi-level city portrait construction. First, POI data, microblog check-in data, and land use data are spatially joined to generate grid area documents. Second, this study proposes a topic feature word extraction model (Weight-LdaVecNet), based on feature word weight constraints, that incorporates topic word embedding and network structure analysis; it combines the LDA model, the Word2Vec model, and the PageRank method to accurately identify city tags. Finally, a multi-level city portrait description framework is generated using a hierarchical clustering algorithm; on this basis, tag clusters are generated, and a similarity matrix is obtained by combining topic feature tags and region feature tags through similarity analysis, to construct a multi-level city region portrait.

LDA model:
The Latent Dirichlet Allocation (LDA) model has been widely used in semantic information mining. It is an unsupervised machine learning technique, also called a document topic generation model, that identifies latent topic information in large-scale document collections using a three-layer Bayesian probabilistic model of words, topics, and documents (Jing et al., 2022). By training on the corpus, the model obtains the document-topic probability distribution and the topic-word probability distribution (Gao et al., 2017). Each document is represented as a probability distribution over topics, and each topic as a probability distribution over words.
The document topic generation process of the LDA model is shown in Figure 3(a), where M denotes the total number of documents and N denotes the total number of words in the mth document. η is the Dirichlet prior parameter of the multinomial distribution of words under each topic, α is the Dirichlet prior parameter of the multinomial distribution of topics under each document, z is the topic of the nth word in the mth document, w is the nth word in the mth document, and the two hidden variables θ and β denote the distribution of topics under the mth document and the distribution of words under the kth topic, respectively. City tag extraction is similar to text semantic analysis, as shown in Figure 3(b): a study area is treated as a document, all the multi-source data in the area are treated as the words of the document, and the distribution of city tags in the area corresponds to the distribution of topics in the document. Therefore, the LDA model is applicable to the identification of city tags, and in turn to the domain of city portraits.
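For reference, the generative process usually associated with this notation can be written as below. This is the standard textbook formulation of LDA, given under the assumption that the symbols match those of Figure 3(a); the topic count K, the per-document word count N_m, and the explicit indices are introduced here for illustration.

```latex
\begin{aligned}
\beta_k  &\sim \mathrm{Dirichlet}(\eta),              & k &= 1,\dots,K \\
\theta_m &\sim \mathrm{Dirichlet}(\alpha),            & m &= 1,\dots,M \\
z_{m,n}  &\sim \mathrm{Multinomial}(\theta_m),        & n &= 1,\dots,N_m \\
w_{m,n}  &\sim \mathrm{Multinomial}(\beta_{z_{m,n}})
\end{aligned}
```

Under the analogy of Figure 3(b), m indexes grid areas, w ranges over the labels observed in an area, and θ_m is the mixture of city topics in that area.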

Word2Vec model:
The Word2Vec word vector model is a three-layer neural network with an input layer, a hidden layer, and an output layer. It is mainly used for learning word vectors from text and has two training schemes, CBOW and Skip-gram (Ma et al., 2023). The CBOW model predicts the target (central) word from its surrounding context words, while the Skip-gram model predicts the probability distribution of the surrounding words from the target word. Unlike the LDA topic model, the Word2Vec model produces feature vectors that encode the semantics of neighbouring words, which can compensate for the semantic ambiguity caused by the LDA model's disregard of word order, while weighting the word vectors can capture their degree of contribution.
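The difference between the two training schemes is easiest to see in the training pairs each one consumes. The sketch below only generates those pairs and does not train a network; the function names, the window size, and the sample tokens are hypothetical.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs: Skip-gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """Return (context_words, center) pairs: CBOW predicts the center word from its context."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        pairs.append((context, center))
    return pairs


# Hypothetical tokens from one grid area document.
tokens = ["flower", "garden", "spring", "tour"]
sg = skipgram_pairs(tokens, window=1)
cb = cbow_pairs(tokens, window=1)
```

Skip-gram produces one pair per (center, neighbour) combination, whereas CBOW produces one pair per position with the whole context as input.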

Feature word extraction based on network propagation:
A network structure can effectively express the relationships between nodes, and network metrics can directly reflect the importance of nodes. Therefore, we can organize the words under the same topic into a network and identify the key nodes through network structure analysis to obtain the keywords of the topic.
Cosine similarity is a commonly used measure of the similarity between two vectors and is widely applied in natural language processing (Sarwar et al., 2022). The similarity can therefore be calculated from the keywords in each topic and their word vector representations, with the cosine value measuring the similarity between pairs of words, to generate a thematic keyword network. The keyword network of the kth topic is represented as shown in Equation (1).

$$G(TR_k) = \big(W(TR_k),\, E,\, S\big) \qquad (1)$$
where $G(TR_k)$ denotes the network structure of the words in topic $k$; $W(TR_k)$ denotes the set of all words in that topic; $S = \{s \mid s = \mathrm{Sim}(w_i, w_j)\}$ denotes the set of edge weights obtained from the similarity calculation; and $E = \{e \mid s > \alpha\}$ denotes the set of edges of $G(TR_k)$. If the edge weight $s$ is less than the threshold $\alpha$, the edge between the corresponding nodes is deleted.
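A thresholded keyword network of this kind can be sketched in a few lines. The example assumes that each keyword already has a word vector (any feature word weighting is taken to have been applied beforehand); the function names, the toy two-dimensional vectors, and the threshold value passed in are illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_keyword_network(word_vectors, threshold):
    """Return {(word_a, word_b): similarity} for pairs whose similarity exceeds the threshold."""
    edges = {}
    for a, b in combinations(sorted(word_vectors), 2):
        s = cosine(word_vectors[a], word_vectors[b])
        if s > threshold:  # pairs at or below the threshold get no edge
            edges[(a, b)] = s
    return edges


# Hypothetical 2-d word vectors for three keywords of one topic.
vectors = {
    "flower": [1.0, 0.0],
    "garden": [1.0, 1.0],
    "parking": [0.0, 1.0],
}
network = build_keyword_network(vectors, threshold=0.1)
```

Here "flower" and "parking" are orthogonal, so no edge is created between them, while "garden" is connected to both.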

Hierarchical clustering algorithm:
Agglomerative hierarchical clustering is a distance-based clustering algorithm that initially treats every data point as its own small cluster and gradually merges clusters until only one large cluster remains (Hettiarachchi et al., 2021). Compared with other clustering methods, its hierarchical aggregation is well suited to exploring the hierarchical nature of urban portraits. In this study, the framework for city portrait description is constructed with agglomerative hierarchical clustering in the following steps. First, all the labels in the study area documents are analysed bottom-up to define the clustering samples and identify the sample points. Next, each sample point is treated as an initial cluster, and label sets are generated based on the distance (i.e., label similarity) between the vectors of the topic labels. Then, the two closest label sets are merged repeatedly until the termination condition is satisfied. Finally, a city portrait description framework with a multi-level structure is generated.
The framework can be used to classify all city labels and obtain multi-granularity label sets under different city portrait levels or dimensions. These multi-granularity label sets are an important data source for depicting city portraits at different levels.
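The merge loop described above can be sketched as follows. The source does not state the linkage criterion or the distance function, so this example assumes average linkage over Euclidean distance and runs until a single cluster remains; the function names and the toy label vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(cluster_a, cluster_b, vectors):
    """Mean pairwise distance between the members of two clusters (assumed linkage)."""
    total = sum(euclidean(vectors[a], vectors[b]) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative(vectors):
    """Merge the two closest clusters repeatedly; return the merges in the order performed."""
    clusters = [(label,) for label in sorted(vectors)]  # every label starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = tuple(sorted(clusters[i] + clusters[j]))
        merges.append((clusters[i], clusters[j], d))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical 2-d vectors for four city labels.
label_vectors = {
    "museum": [0.0, 0.0],
    "palace": [0.0, 1.0],
    "company": [5.0, 0.0],
    "restaurant": [5.0, 2.0],
}
merge_history = agglomerative(label_vectors)
```

Reading `merge_history` from first to last gives the hierarchy from the finest label sets up to the single root cluster; cutting it at different depths yields label sets of different granularity.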

Topic Semantic Feature Extraction
In this study, POI data, microblog check-in data, and land use type data are pre-processed by cleaning, word segmentation, and spatial joining, and the resulting grid region documents are used as the text collection for building the LDA corpus. The quality of the topic models is evaluated with perplexity and topic coherence (Ma et al., 2020). After several rounds of experiments, we determined that the optimal number of topics is 138, and finally obtained the topic probability distributions and topic word distributions of the city and its regions. These topic word distributions summarize the characteristics of the city well. To improve objectivity, we used confidence intervals to retain the feature words whose topic word probabilities are greater than or equal to the mean as the final topic feature words.
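The mean-based screening of topic words can be illustrated with a short sketch. It implements only the "greater than or equal to the mean" rule stated above and does not reproduce the confidence interval computation, whose details are not given; the function name and the sample probabilities are hypothetical.

```python
def select_feature_words(topic_word_probs):
    """Keep the words whose probability is at or above the topic's mean word probability."""
    mean_prob = sum(topic_word_probs.values()) / len(topic_word_probs)
    selected = {w: p for w, p in topic_word_probs.items() if p >= mean_prob}
    # Sort by descending probability so the most characteristic words come first.
    return sorted(selected.items(), key=lambda item: item[1], reverse=True)


# Hypothetical topic-word probabilities for one topic.
topic = {"flower": 0.30, "garden": 0.25, "picture": 0.20, "parking": 0.05, "company": 0.05}
feature_words = select_feature_words(topic)
```

With these sample values the mean is 0.17, so the low-probability words "parking" and "company" are dropped.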
The LDA model and the Word2Vec model are combined to extract the topic feature words, with the Word2Vec model accounting for the semantic association of neighbouring words. Introducing the Word2Vec model and weighting by the topic feature word probabilities makes the extracted topic feature words more accurate. To further explore the semantic relationships between topic feature words, we constructed a feature word network using similarity analysis and determined the edges between words by filtering on a weight threshold. We ran the experiments several times to find the best similarity threshold and found that the average similarity between the keywords of each topic was generally higher when the threshold was 0.1. Edges between topic feature words whose similarity falls below this threshold are filtered out, yielding a high-quality feature word network.
We calculated the PageRank (PR) values of the network nodes using the PageRank algorithm and applied the confidence interval method for a secondary screening to find higher-quality feature words that can serve as city labels. Table 1 shows that the feature labels extracted by the Weight-LdaVecNet model that appear at the front of the feature vector tend to focus on a single topic. This indicates that when our method identifies the city labels of a particular topic, labels with similar themes usually cluster at the front of the feature vector. For example, the first group of topics mainly describes the Beijing World Flower Grand View Garden: 'flower' and 'Grand View' directly highlight the core location of the topic, 'white pagoda' and 'Wanchun' describe the climate and things at the location, and 'joy' and 'picture' depict people's spring outings. This group of feature words can be summarized as a tourist attraction topic. In contrast, the LDA model extracts labels that are unrelated to the topic, e.g., 'parking' and 'company', which tend to interfere with the generalized description of the topic. Therefore, the results obtained by our method are more accurate and meaningful, and provide strong support for urban portrait research.
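The PageRank scoring and secondary screening can be sketched as below. This is a plain power-iteration version of weighted PageRank on an undirected network, not necessarily the exact variant used in the study; the damping factor 0.85, the iteration count, the mean-based screening rule, the function names, and the sample edge weights are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration on an undirected graph.

    `edges` maps (node_a, node_b) to a positive similarity weight.
    """
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    nodes = sorted(neighbours)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    # Total edge weight at each node, used to normalise outgoing contributions.
    strength = {node: sum(neighbours[node].values()) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[other] * w / strength[other]
                for other, w in neighbours[node].items()
            )
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def above_mean(rank):
    """Secondary screening (assumed rule): keep nodes whose score is at or above the mean."""
    mean_rank = sum(rank.values()) / len(rank)
    return sorted(node for node, r in rank.items() if r >= mean_rank)


# Hypothetical feature word network for one topic.
word_edges = {
    ("flower", "garden"): 0.8,
    ("flower", "picture"): 0.6,
    ("flower", "joy"): 0.5,
    ("garden", "picture"): 0.3,
}
pr = pagerank(word_edges)
city_labels = above_mean(pr)
```

In this toy network "flower" is connected to every other word with the largest weights, so it receives the highest score and survives the screening.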

Analysis of the semantic features of the overall city topic:
The Weight-LdaVecNet model can be used to obtain the overall topic distribution of the city. Different topic weights indicate different degrees of contribution to the city, and different topics also reveal the co-occurrence patterns of the labels, i.e., the probability of occurrence of a city label differs from topic to topic. To clearly represent the composition of the feature labels of each topic, this paper summarizes the semantic features of the topics in descending order of topic weight probability and draws the feature labels of each topic as a word cloud, where a larger word size indicates a higher probability of occurrence in the topic (Figure 4). As Figure 4 shows, the composition and probability of occurrence of the feature labels differ across topics, and the semantic features of each topic can be initially judged from the feature labels with the highest probabilities. Some topics have distinct semantic features: topic 6 and topic 10 belong to the theme of life services; topic 13 covers catering services and companies, reflecting the theme of business; and topic 15 mainly describes the Wu Dao Ying Hutong in Beijing, reflecting the theme of famous places. In addition, topic 1, topic 2, topic 3, and others cover various types of places, such as government agencies, scientific, educational and cultural places, social organizations, and residential areas, reflecting the diversity and complexity of Beijing and demonstrating the richness of the city in political, cultural, social, and residential aspects.

Analysis of semantic features of urban area topic:
Previous urban studies have focused on the portrayal of whole cities or key areas as a reflection of the general understanding of cities. However, such studies lack a more fine-grained description of the urban portrait. At the same time, most existing studies that use grid area division do so to identify area categories, which provides only a basic description of the area itself and can hardly reflect people's overall perception of the whole area. Therefore, we combine the advantages of both approaches to describe the city portrait from multiple perspectives and establish an overall picture. As shown in Figure 5, four typical regions, namely Summer Palace, Wangjing, Forbidden City, and World Flower Grand View, were selected, and word clouds were used to highlight the tag categories and thus better describe the topic of each region. In the attraction regions, the probability distributions of the main feature tags are similar, indicating that the semantics of the regional topics are more focused and generalized. In the commercial region, 'company' is the most prominent feature tag, but the other tags span a variety of categories such as green space leisure services, government agencies, and food and beverage services, indicating that the region's thematic semantics are more complex and diverse. Therefore, the regional portrait of commercial areas is more complex and diversified than that of attraction areas.

[Table columns: LDA Model | Weight-LdaVecNet Model]

As can be seen from Table 2, the main features describing the city focus on 'cultural landscape', 'cultural activities', 'cultural venues', 'educational places', etc. Using the specific feature labels, we can filter the city portrait descriptions under different levels of the hierarchy; like the business cards of cities, they highlight the personality and characteristics of a city. Although these descriptions cannot completely cover all features of a city, they can be computed from the label weights under the specific levels or dimensions selected from the city portrait description framework, yielding city portrait descriptions of different dimensions and granularities. This is of great significance for urban planning and management, the shaping of urban characteristics, and the inheritance of urban culture.

Multi-level urban area portrait analysis:
To mine the city portrait more finely, the similarity between the extracted tag clusters and the overall topic features is analysed, and the tag cluster-topic similarity matrix is obtained (Figure 7(a)).
The figure shows that cluster 8, which belongs to the 'business environment' dimension, and cluster 7, which belongs to the 'cultural landscape' dimension, are strongly correlated with several topics. This indicates that the extracted thematic semantic features contain many commercial places that facilitate economic development, such as 'companies' and 'Chinese restaurants', as well as cultural attractions and places that display various types of history and culture. From the similarity analysis between the overall topic semantic features extracted in 4.1 and the regional topic semantic features, the topic-region label similarity matrix was obtained (Figure 7(d)). The similarity between them is very high, but the two are not identical. According to the principle of the LDA model, each regional document selects, from the overall topic semantics, the topic semantic features that match it on the basis of topic probability; however, the topic with the maximum weight probability does not necessarily express the whole document completely. We therefore weight the regional topic features by their topic weights before performing the similarity analysis, which gives a more accurate similarity.
To identify the feature dimension to which a region belongs, the tag cluster-topic matrix and the topic-region matrix are matched, which yields a portrait of each city region with dimensional characteristics. This is shown in Figure 8.
... more prominent than the cultural dimension feature, because a region has a high mixture of POI data of types such as government agencies, scientific, educational and cultural places, and residential areas, resulting in a higher probability of co-occurrence of these POI data in the documents of the region. The microblog check-in data mainly involve cultural activities such as tourism and daily sharing, and a large volume of cultural activity records directly affects the degree of contribution of POI data in the overall study area documents, leading to the prominence of cultural topics in the overall city portrait. In the secondary feature dimension, apart from the commercial environment dimension, areas with cultural landscape and venue features have more data than areas with educational place features. The layout of cultural venues needs to be better planned and balanced to improve life satisfaction, while educational resources need more planning for their distribution across the region to meet people's needs. Therefore, there is a need for additional cultural venues or a reallocation of existing resources, and for a more rational allocation of educational resources.
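One plausible reading of the matching step is sketched below. The source does not specify how the two matrices are matched, so this example assumes that each (tag cluster, region) pair is scored by summing the products of cluster-topic and topic-region similarities over shared topics, and that a region is assigned the highest-scoring cluster. The function name and all similarity values are hypothetical.

```python
def match_regions_to_clusters(cluster_topic, topic_region):
    """Assign each region the tag cluster with the highest combined similarity.

    cluster_topic: {cluster: {topic: similarity}}
    topic_region:  {topic: {region: similarity}}
    Returns {region: (best_cluster, score)}.
    """
    scores = {}
    for cluster, topic_sims in cluster_topic.items():
        for topic, ct in topic_sims.items():
            for region, tr in topic_region.get(topic, {}).items():
                key = (region, cluster)
                # Accumulate the cluster-topic x topic-region product over shared topics.
                scores[key] = scores.get(key, 0.0) + ct * tr
    best = {}
    for (region, cluster), score in scores.items():
        if region not in best or score > best[region][1]:
            best[region] = (cluster, score)
    return best


# Hypothetical similarity values for two tag clusters, two topics, and two regions.
cluster_topic = {
    "business environment": {"t1": 0.9, "t2": 0.1},
    "cultural landscape": {"t1": 0.2, "t2": 0.8},
}
topic_region = {
    "t1": {"Wangjing": 0.7, "Summer Palace": 0.1},
    "t2": {"Wangjing": 0.2, "Summer Palace": 0.9},
}
region_dimension = match_regions_to_clusters(cluster_topic, topic_region)
```

This is equivalent to multiplying the cluster-topic matrix by the topic-region matrix and taking the column-wise maximum; other matching rules would fit the description equally well.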

CONCLUSIONS
City portraits are effective tools for understanding city profiles, reflecting changes in city development, and supporting urban planning and governance. For city managers and planners, it is crucial to build a multi-dimensional and multi-level city portrait. However, current research on city label identification does not consider how much the thematic semantic feature labels contribute to the topics, which may affect the identification results. In addition, regional thematic features have not been introduced to explore hierarchical city region portraits, so more research is needed to achieve multi-dimensional and multi-level city portraits. To this end, this study proposes the Weight-LdaVecNet model, which identifies city labels accurately through multi-source data grid fusion. The model accounts for the contribution of labels to topics when identifying city labels, and the results demonstrate its reliability. On this basis, a multi-level city portrait description framework is constructed using hierarchical clustering analysis. The framework shows that the overall portrait of Beijing is mainly reflected in cultural dimension features: famous monuments and historical sites such as the 'Forbidden City', 'Tiananmen Square' and the 'National Museum' attract attention and serve as a unique business card for promoting the city. This study also uses similarity analysis to construct similarity matrices among labels, overall thematic features and regional thematic features, and extracts a hierarchical city region portrait. The city region portrait is dominated by economic dimension features, which reflects the city's rapid urbanization and high level of economic development. Education level, cultural activities, cultural landscapes and cultural venues differ significantly and are unevenly distributed among regions, which calls for a more rational allocation of urban resources and better planning of the urban layout. The analysis of the overall city portrait and the regional portrait shows that the two portraits have different representative characteristics, indicating that a more fine-grained and hierarchical city portrait is necessary and can play an important guiding role in city management and urban planning. In the future, more multi-source data, e.g., pedestrian flow data, trajectory data and other dynamic data, should be introduced to extract thematic semantic features, which would help build a more complete and comprehensive city portrait.

Figure 1. The location of the study area.

Figure 3. (a) The latent Dirichlet allocation model; (b) the relationship between document-topic and city tags.

Figure 4. Word cloud distribution of topic feature words.

Figure 5. Portraits of typical regions. Areas A to D are the Summer Palace, Wang Jing, the Forbidden City, and the World Floral Grand View.

4.2 Multi-level city portrait construction

4.2.1 Multi-level city portrait description framework generation: The study uses the Weight-LdaVecNet model to extract the set of city feature labels and uses cosine similarity to measure the similarity between them, from which a distance matrix between the labels is constructed. An agglomerative (cohesive) hierarchical clustering algorithm then aggregates the labels based on this matrix; across multiple experiments, the clustered topic categories were best differentiated with 20 clusters. With reference to existing city portrait research, the topic categories of the clusters are manually labelled to form a city portrait description framework with a hierarchical structure. Figure 6 shows that Beijing's portrait is divided into three major categories, namely culture, economy, and society, with culture-related topics accounting for the highest proportion. This reflects Beijing's long history and rich cultural heritage. The commercial environment has the second highest proportion, mainly covering various types of establishments, which indicates that Beijing is a comprehensive commercial city and has a positive effect on economic development.
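The clustering step in Section 4.2.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that each label is represented by an embedding vector, that distance is 1 minus cosine similarity, and that clusters are merged by average linkage (the linkage criterion is not stated in the text). The label vectors are hypothetical, and the toy example stops at 2 clusters rather than the 20 used in the study.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v) if norm_u and norm_v else 1.0


def agglomerative(vectors, n_clusters):
    """Naive average-linkage agglomerative clustering on cosine distance."""
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per label
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters


# Hypothetical label embeddings (2-D for readability).
labels = ["Forbidden City", "National Museum", "company", "Chinese restaurant"]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

clusters = agglomerative(vectors, n_clusters=2)
for members in clusters:
    print([labels[i] for i in members])
```

The naive search over cluster pairs is easy to follow and adequate for a few hundred labels; for larger label sets a library routine for hierarchical clustering would be the more practical choice.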

Figure 7. (a) Cosine similarity of Tag Cluster-Topic; (b) Cosine similarity of Tag Cluster-Topic.

Table 1. Comparison of the feature tag results extracted by the LDA model and the Weight-LdaVecNet model.

Table 2. City portrait calculation results (Top 10). 1D and 2D represent the first dimension and second dimension, respectively; C represents Culture; S represents Society; CL represents Cultural Landscapes; CV represents Cultural Venues; CA represents Cultural Activities; EP represents Educational Places; T represents Topics.