Feature Extraction of English Semantic Translation Relying on Graph Regular Knowledge Recognition Algorithm

L. Yang


Introduction
In the study of English semantic translation, we have selected the topic of extracting semantic translation features from English translation. In recent years, as globalization continues, reading foreign-language books has become a need in people's everyday lives. Readers urgently want to understand the meaning of these texts and the stories behind them. Enthusiasm for English directly contributes to the spread of foreign cultures, and English learning is also the most direct way for English translators to focus on translation. However, major foreign-language reports and broadcasts are mainly conducted in English, and most domestic translation companies lack the ability to provide accurate first-line translation. The translation ability of current agencies is relatively weak, while the requirements of domestic readers keep rising. Because of this imbalance, the demand for English semantic translation is urgent, which is inevitable in the context of world integration. English semantic translation is an important research topic in natural language processing, and it is essential for acquiring and analyzing knowledge in the context of big data. As a critical part of natural language, English translation is characterized by ambiguity, diversity, and irregularity, which causes great problems in understanding natural language. Entity linking technology accurately links an entity reference to the corresponding entity concept based on its context, and it introduces rich background knowledge into tasks related to natural language processing. Solving the problems caused by the above characteristics can better serve related academic research and production applications. As English has become one of the world's universal languages, China's reform and opening up have brought the country into an era in line with international standards. English has entered China on a large scale, and foreign companies have started to take root in China, accompanied by an urgent need for English translation. An accurate translation can prevent a lot of trouble, which leads more and more people to invest in studying English semantic translation.
Deng A considered a graph $G$ with vertex set $V(G)=V$ and edge set $E(G)=E$, where $G^{l}$ is the line graph and $G^{c}$ the complement of $G$. Let $G^{0}$ be the graph with $V(G^{0})=V$ and no edges, $G^{1}$ the complete graph on the vertex set $V$, $G^{+}=G$, and $G^{-}=G^{c}$. Let $B(G)$ (respectively $B^{c}(G)$) be the graph with vertex set $V \cup E$ such that $(v, e)$ is an edge in $B(G)$ (respectively in $B^{c}(G)$) if and only if $v \in V$, $e \in E$, and vertex $v$ is incident (respectively not incident) to edge $e$ in $G$ [1]. Bitkina V V raised the problem of studying distance-regular graphs in which the neighborhoods of vertices are strongly regular graphs whose second eigenvalue is at most t, for a given positive integer t. This problem reduces to the description of distance-regular graphs in which the neighborhoods of vertices are strongly regular graphs with non-principal eigenvalue t, for t = 1, 2, ...
[2]. Kaveh A believes that graph theory has many applications in structural mechanics, where topological transformations make related problems easier to solve. The skeleton graph and natural correlation graph of a finite element model are transformed in this way, and these transformations can be used effectively in ordering the nodes and elements of conventional finite element models. In his article, he proposed an effective method that uses graph products and directed graph products to generate the skeleton graph, the natural correlation graph, and the grid base of the finite element model [3]. Another study attempted to make the cuckoo search (CS) algorithm parameter-free by removing the Levy step, and verified the proposed algorithm on 23 standard benchmark functions [4]. Sahoo SP proposed an interest point detection technique based on the local maxima of difference images (LMDI), a selected projection tree with crossing segmentation, and a revised voting score for human action recognition. In the LMDI-based method, consecutive frame differencing is used to obtain difference images, and 3D peak detection is then applied to the computed set of difference images. Hough voting is applied to test videos to calculate the highest correlation score obtained against a single training class [5]. To address semantic irregularities in the field of relay protection, Qian H designed an intelligent semantic recognition algorithm for relay protection information based on four modules: dictionary management, semantic matching, retrieval preprocessing, and retrieval. The standard semantic data obtained was examined by testing various non-standard semantic inputs, and the results indicate that the algorithm has good performance and feasibility [6].
Bracken J believes that translations are usually not directly aligned across languages, and that indirect mapping decreases the accuracy of language learning. To make this issue easier to examine, he introduced a new continuous measure (TSV) that quantifies the semantic relatedness of the multiple translations of a word. He then determined how the relatedness between translations affects the learning of translation-ambiguous words from German to English. German words with low TSV values were translated more slowly and less accurately than German words with high TSV values [7]. Tan Y W pointed out that, because of the particular nature of legal English, its translation differs from other kinds of translation. Legal translation can be regarded as a dual operation of legal transfer and language transfer, so it must consider many factors. Frame semantics offers a new perspective on legal translation. He proposed three legal translation strategies based on frame semantics: frame correspondence, frame selection, and frame transfer [8]. The above studies mainly introduce graph regular knowledge, recognition algorithms, and English semantic translation. Most of them, however, remain at the level of technical research, and few go deep into applications. As a result, how the technology should be used remains unclear and the key points of the relevant techniques are insufficiently covered, which makes these articles less persuasive.
The innovation of this article lies in providing theoretical support for English semantic translation. Feature selection for English semantic translation based on the graph regular low-rank score serves as the technical support, and the improved feature extraction and recognition algorithm is explored through designed experiments. In addition, semantic representation and graph regularization are applied to entity linking, and the accuracy of semantic translation is compared and analyzed. The comparison shows that the improved English semantic translation model is 10%-15% more accurate than the conventional translation model, which supports accurate and stable translation.
The rest of the article is structured as follows: part 2 describes the related works, part 3 discusses the methodology of the study, part 4 presents the graph regularization and semantic entity link experiment, part 5 presents the efficiency analysis, and part 6 concludes the study with future work.

Related works
One study builds a semantic mapping model for interactively optimal English-Chinese translation, creates an English translation model using a feature extraction technique, and derives the best translation strategy with the newly proposed feature extraction algorithm. In practice, however, this approach translates English slowly, so translation efficiency remains poor [26]. To enhance the quality of machine translation, another model combines the language-template-based translation approach with the statistical approach of conditional random fields to segment and analyze long sentences along syntactic and statistical dimensions. Unfortunately, the model is very complicated to implement, which lengthens translation time and reduces translation efficiency [27]. An improved non-rigid point set registration method determines a set of points and their neighbors to form a subgraph. It then calculates the likelihood of local support using the correspondence obtained by sorting the edges of the two subgraphs by distance and angle. Various synthetic and real data were used to verify the method, demonstrating that it enhances the robustness and accuracy of conventional methods [28]. A novel probabilistic clustering approach is designed to isolate linear groups in datasets. The algorithm maximizes a mixture probability density function, similar to expectation maximization, with each component modeling a line segment. In experimental assessments, the approach is on par with or superior to modern cluster-based methods and conventional line detection techniques [29].
In [30], the unlabeled text vocabulary is vectorized by combining lexical representation with vector features, and useful information about phrases and their semantics is extracted with a multilayer neural network model. To complete the English translation model, a neural network scores and ranks words within an online ranking framework, obtains the semantic set of the sample data, and predicts changes in word order. However, the parts of speech of English phrases are poorly recognized, which leads to inaccurate translation [30]. Another English translation model uses intelligent recognition based on an improved GLR algorithm. Part-of-speech recognition results are obtained by building the phrase structure around the phrase center, and the structural ambiguity between English and Chinese in those results is reduced through the syntactic function of the parsing linear table. The recognized content is finally collected to complete the design of the translation model. However, the model struggles to detect the part of speech of English phrases correctly, which harms the quality of the resulting translation [31].
Interactive information retrieval problems may be addressed with ontology-based techniques for semantic document recognition and representation. The representation provides interactive tools that display the ontology graphically by building aspect projections, which reduces the dimensionality of the graph to a level that is manageable visually and perceptually. Keyword search and shallow semantic parsing, the two most common efficient and reliable ciphertext search techniques, cannot fully meet users' search intents [32]. The network in [33] outperforms most current RGB-D networks, with high accuracy and a fast inference speed of 22 Hz at full 2048 × 1024 resolution. Text-based and knowledge-based retrieval strategies remain at odds with each other. Both fail to handle keyword-based query and ranking retrieval adequately, and the former also ignores intricate connections [33]. The long short-term memory (LSTM) handled the problem of long-term dependencies well once gate functions were included in the cell structure. Since its introduction, the LSTM has been responsible for almost all of the impressive achievements based on RNNs, and it has recently become a focus of deep learning research. To investigate the learning capacity of the LSTM cell, the study [34] conducts a systematic review of the LSTM cell and its variants. Sherlock, a multi-input deep neural network for semantic type detection, is presented in [35]. Sherlock was trained on 686,765 data columns retrieved from the VizNet corpus by matching 78 semantic types from DBpedia to column headers. Each matched column is described by 1,588 features covering its statistical properties, character distributions, word embeddings, and paragraph vectors [35].

| References | Methodology | Drawbacks |
| --- | --- | --- |
| [28] | Improved non-rigid point set registration algorithm | The computational cost of non-rigid point set registration procedures may be high, especially when working with large-scale point sets or intricate deformations. |
| [30] | Transformer-based neural machine translation system | To perform well in translation, transformer models often need a lot of training data. |
| [31] | Improved GLR algorithm | The enhanced GLR algorithm has trouble addressing frequent syntactic and semantic problems in natural language. |
| [32] | Ontology graphs | Scalability becomes a problem as the ontology graph's size and complexity rise. |
| [33] | Real-time fusion semantic | |

3 English semantic translation method

English semantic translation theory
This article explores the characteristics and countermeasures of English-Chinese translation under the guidance of text classification and semantic translation theory. The characteristics mainly comprise five types of words and sentences together with context features, and each countermeasure corresponds to one of them.
According to the classification of text types, sports news, which has universal text characteristics, should be classified as informative text. In other words, the translation of sports news should be based on communicative translation. However, translation principles fall into three groups: (1) principles oriented towards the source language or the target language, (2) principles oriented towards the author or the reader, and (3) aesthetics-oriented principles. Under these principles, when the specific language form of the original text is as important as its content, semantic translation is needed regardless of the type of the original text. When translating such texts, the primary function of news text, which is to convey information, cannot be ignored [9].
Based on translation theory, this article mainly addresses two aspects: analyzing the word, sentence, and context characteristics of the text, and explaining the corresponding translation strategies. Drawing on the author's translation practice, it summarizes five kinds of words and sentences that cover almost all cases in English translation: professional terms, idioms, direct quotations, cultural words, and borrowed representative words [10]. For these five characteristics, the corresponding countermeasures are proposed: strict observance of norms, following mainstream usage, faithfulness and expressiveness, case-by-case analysis, and discussion by category. The role of context in English translation is also explored. The major categories of English translation are shown in Figure 1.

Types of texts and division of readers
The German scholar Karl Charles [11] proposed three functions of language, namely the informative, expressive, and appeal functions, and divided texts into three corresponding types. Newmark likewise proposed three text types: informative text, expressive text, and vocative text [12]. The three types are shown in Figure 2. Informative text explains objective events without personal involvement, and information occupies the dominant position, as in most non-literary works. Expressive text fully reflects the original author's language style, thoughts, and feelings, as well as the form and content of the language; the author occupies the dominant position, as in serious literary works. Vocative text aims to attract readers, who occupy the dominant position, as in advertisements.

1) Language positioning of the target text
In terms of text type, two points matter. First, the general function of the text is to convey information, with an auxiliary function of appealing to readers. Second, the English text also has an expressive function. Since the main job of the text is to convey information, part of the text belongs first of all to the informative type under the three-type text classification [13]. If the specific language form of a passage is as important as its content, then regardless of text type the passage emphasizes the expressive function, the unit of meaning becomes very important, and semantic translation must be used [14]. Semantic translation should focus on the following points. (1) Highlight the subject. The subject is the heart of a sentence; if it is wrong, the sentence reads as rambling. (2) Pay attention to the collocation of words. In Chinese, a sentence can often still be understood even if its words are poorly matched or out of order, but English is different. In English, the translator needs to be aware of how adjectives and nouns, adverbs and verbs, and other combinations are used together. Direct quotations in some texts involve value judgments. Translators must follow the principle of neutrality, convey the original faithfully, and use semantic translation. However, some texts challenge this point. There is a tension between the spontaneity of spoken language and the logic of written language: when spoken language enters writing, it is adjusted logically to suit the strongly analytical character of written language [15]. Semantically unclear spoken language often requires translators to make appropriate adjustments based on the original text, which also makes the text easier for readers to understand. Of course, all such adjustments must be premised on not changing the intrinsic meaning and value judgment of the original text.

Three common feature selection algorithms

1) Variance score
The variance scoring model (formula 1) scores each feature by the variance of its values across the samples. From formula 1, it can be seen that the larger the variance score Hs(r) of a feature, the more information the feature contains. Samples of different types are distinguished by how much their values of the feature differ: the greater the difference between data sample points in a feature value, the easier it is for that feature to separate the sample points. Variance scoring therefore evaluates each feature with formula 1, selects the features with high scores and rich information, and discards those with low scores and little information.
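The selection rule described above can be sketched in a few lines. The sketch assumes the usual form of the variance score, $V_r = \frac{1}{n}\sum_{i=1}^{n}(f_{ri}-\mu_r)^2$, where $f_{ri}$ is the value of feature $r$ on sample $i$ and $\mu_r$ is its mean; this notation, the function names, and the toy matrix are illustrative assumptions, not taken from the original experiments.

```python
import numpy as np

def variance_scores(X):
    """Return the variance of each feature (column) of X.

    X has shape (n_samples, n_features); a higher score means the
    feature varies more across the samples.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    return ((X - mean) ** 2).mean(axis=0)

def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    order = np.argsort(scores)[::-1]
    return order[:k]

# Toy data (hypothetical): feature 1 is nearly constant, feature 2 varies most.
X_toy = np.array([[1.0, 5.0, 0.0],
                  [2.0, 5.1, 10.0],
                  [3.0, 4.9, 20.0]])
scores = variance_scores(X_toy)
top2 = select_top_k(scores, 2)   # keeps features 2 and 0, drops feature 1
```

Because the score ignores any relation between samples, it serves mainly as a baseline for the graph-based scores that follow.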

2) Laplace score
Laplace scoring adds a similarity constraint between data points to the variance scoring model; that is, feature points of the same category should have similar spatial distributions. Under the Laplace score formula, a good feature should have a smaller Laplace score Ls(r). Each feature is scored, and the features with lower scores are selected to form a new feature subset, thereby reducing dimensionality [16].
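As an illustration of why a lower Laplace score marks a better feature, the following sketch uses the commonly cited definition of the Laplacian score, $L_r = \tilde{f}_r^{\top} L \tilde{f}_r / \tilde{f}_r^{\top} D \tilde{f}_r$ with $L = D - S$. The similarity matrix `S`, the function name, and the toy data are assumptions made for the example.

```python
import numpy as np

def laplacian_scores(X, S):
    """Laplacian score of each column of X for a symmetric similarity matrix S."""
    X = np.asarray(X, dtype=float)
    S = np.asarray(S, dtype=float)
    d = S.sum(axis=1)                 # degree of each sample
    L = np.diag(d) - S                # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f = f - (f @ d) / d.sum()     # remove the degree-weighted mean
        denom = f @ (d * f)           # degree-weighted variance term
        scores[r] = (f @ L @ f) / denom if denom > 0 else np.inf
    return scores

# Two groups of samples: {0, 1} and {2, 3}; similar samples are connected.
S_toy = np.array([[0, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
# Column 0 follows the groups; column 1 cuts across them.
X_toy = np.array([[0.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 0.0],
                  [1.0, 1.0]])
ls = laplacian_scores(X_toy, S_toy)
```

The feature that is constant within each group receives the smaller score, which is the behavior the text relies on when it keeps low-scoring features.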

3) Sparse scoring
In sparse representation, a data sample point is linearly reconstructed from a few other data points under an over-complete dictionary to obtain a more concise data representation. The resulting reconstruction coefficient matrix replaces the similarity matrix in the Laplace score to give a sparse scoring model, which can fully express the local topological structure of the data. If sample points come from the same subspace, they are highly similar and correlated, so they play a more significant part in the reconstruction. Conversely, if they come from different subspaces, the similarity between the sample points is weak, so they play a small role in the reconstruction [17]. The linear reconstruction coefficients of the sample points are therefore sparse. For this reason, a sparsity constraint is added to the coefficients when each sample is reconstructed, giving the sparse reconstruction model, in which n represents the total number of sample points.
Considering the influence of noise, the equality constraint of this model is relaxed into an inequality-constrained minimization (formula 4). The similarity between two sample points is then given by the corresponding reconstruction coefficients. Knowing the similarity between the collected samples makes it possible to screen their features more comprehensively.
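For reference, a common way to write the sparse reconstruction model and its noise-tolerant relaxation is given below. The notation is an assumption: $x_i$ is the $i$-th sample, $X = [x_1, \dots, x_n]$ is the dictionary formed by all samples, $s_i$ is the coefficient vector of $x_i$ with its $i$-th entry fixed to zero so that a sample does not reconstruct itself, and $\varepsilon$ is a noise tolerance. The exact constraints used in the original formulas may differ.

```latex
% Exact reconstruction of x_i from the other samples (assumed form)
\min_{s_i} \; \|s_i\|_1
\quad \text{s.t.} \quad x_i = X s_i

% Relaxed form that tolerates noise (assumed form of formula 4)
\min_{s_i} \; \|s_i\|_1
\quad \text{s.t.} \quad \|x_i - X s_i\|_2 \le \varepsilon
```

Stacking the solutions $s_1, \dots, s_n$ gives the coefficient matrix that takes the place of the similarity matrix in the Laplace score.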

Feature selection of English semantic translation based on graph regular low-rank score

1) Basic concepts of graphs
A graph is a data structure composed of a set of vertices and a set of relations between vertices, written Graph = (V, E), where V is the set of vertices and E is the set of edges between vertices.
If the edges between any two vertices are undirected, as shown in Figure 3, the graph is called an undirected graph [18]. In image processing, a graph usually refers to an undirected graph, and it is generally constructed under the manifold assumption, which states that samples in a small neighborhood have similar properties.
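A minimal sketch of the Graph = (V, E) structure as an adjacency matrix is shown here. The vertex names and the helper `build_adjacency` are hypothetical; the point is only that an undirected graph yields a symmetric matrix.

```python
def build_adjacency(vertices, edges):
    """Adjacency matrix of an undirected graph given as (V, E)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1   # undirected: store both directions
    return A

V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d")]
A = build_adjacency(V, E)
is_symmetric = all(A[i][j] == A[j][i] for i in range(4) for j in range(4))
```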
The low-rank representation (LRR) model uses the data itself as a dictionary to learn the lowest-rank coefficient matrix, which gives it a robust global description and good resistance to interference. However, many studies in manifold learning have shown that the local structural information of the data also plays a significant role in accurately expressing its essential attributes. Manifold learning is the process of discovering the low-dimensional manifold within a high-dimensional space and then finding the corresponding embedding mapping to accomplish dimensionality reduction or data visualization, under the assumption that the data was uniformly sampled from the low-dimensional manifold. The goal is to get to the essence of the data by analyzing observations and discovering the underlying rules that produce them. Graph-based algorithms reflect the local structure of the high-dimensional sample space very well, which helps select discriminative translation features conducive to clustering and classification from massive English semantic data. In this part, a new data representation model, called the graph regular low-rank representation, is constructed by combining the original LRR model with a graph regularization term that reflects the local similarity structure of the data [19]. The coefficient matrix obtained by solving the model is then used to construct the graph weight matrix, and a new scoring method is adopted for feature selection in English semantic translation. This is called the graph regular low-rank scoring algorithm, and the specific process is shown in Figure 4.

2) Constructing the regular term of manifold
The geometric structure of the data plays a vital role in discrimination. To preserve the local geometric structure between samples in a neighborhood, the manifold hypothesis is applied: if two sample points x_p and x_q are neighbors in the high-dimensional data space, their low-dimensional representation coefficients z_p and z_q should also be neighbors. Manifold learning thus preserves the geometric topology of the high-dimensional space after dimensionality reduction, which simplifies computation [20]. The weight between neighboring samples is expressed here with the widely used Gaussian kernel function. Based on the manifold hypothesis, high-dimensional data is embedded in a low-dimensional manifold, and two samples that fall in a small local neighborhood of that manifold are assigned to the same category. A reasonable way to achieve this is to minimize the manifold regularization term of Equation 8, in which Tr denotes the trace of a matrix and D is a diagonal matrix. Equation 8 uses the weight between sample points to reflect their distance in the corresponding low-dimensional space: when the weight between two points is large, their distance in the low-dimensional space is small, and when the weight is small, the distance is large [21]. Combining the low-rank representation model with Equation 8 gives the objective function of the graph regular low-rank representation.
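The regular term can be checked numerically under the usual definitions, which are assumed here rather than taken from the original equations: $W_{pq} = \exp(-\|x_p - x_q\|^2 / (2\sigma^2))$ for neighboring samples, $D_{pp} = \sum_q W_{pq}$, $L = D - W$, and $\frac{1}{2}\sum_{p,q} W_{pq}\|z_p - z_q\|^2 = \mathrm{Tr}(Z L Z^{\top})$. The neighborhood size `k`, the bandwidth `sigma`, and the random data are illustrative.

```python
import numpy as np

def gaussian_knn_weights(X, k=2, sigma=1.0):
    """Symmetric kNN weight matrix with Gaussian kernel weights.

    X has shape (n_samples, n_features).
    """
    X = np.asarray(X, dtype=float)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared distances
    n = X.shape[0]
    W = np.zeros((n, n))
    for p in range(n):
        # Position 0 of the sorted row is the point itself (distance 0).
        neighbors = np.argsort(sq[p])[1:k + 1]
        W[p, neighbors] = np.exp(-sq[p, neighbors] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                    # make the graph undirected

def graph_laplacian(W):
    """L = D - W, where D is the diagonal degree matrix."""
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))      # 6 samples (rows)
Z = rng.normal(size=(4, 6))      # one coefficient column z_p per sample
W = gaussian_knn_weights(X, k=2)
L = graph_laplacian(W)

# Pairwise form and trace form of the regular term should agree.
pairwise = 0.5 * sum(W[p, q] * np.sum((Z[:, p] - Z[:, q]) ** 2)
                     for p in range(6) for q in range(6))
trace_form = np.trace(Z @ L @ Z.T)
```

The trace form is the one that enters the objective function, since it is quadratic in Z and easy to differentiate.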

3) Model solution
To make each variable in the objective function easy to separate during the alternating updates, a new auxiliary variable J is first introduced into the model. The structure of the resulting model is shown in Figure 5.
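For orientation, a standard way of writing a graph regularized low-rank representation and its split form is sketched below. This is an assumed formulation: the $\ell_{2,1}$ noise term, the trade-off parameters $\lambda$ and $\beta$, and the graph Laplacian $L = D - W$ are conventional choices that may differ from the original objective.

```latex
% Graph regular low-rank representation (assumed standard form)
\min_{Z,E} \; \|Z\|_* + \lambda \|E\|_{2,1} + \beta \, \mathrm{Tr}(Z L Z^{\top})
\quad \text{s.t.} \quad X = XZ + E

% Equivalent model after introducing the auxiliary variable J
\min_{Z,E,J} \; \|J\|_* + \lambda \|E\|_{2,1} + \beta \, \mathrm{Tr}(Z L Z^{\top})
\quad \text{s.t.} \quad X = XZ + E, \quad Z = J
```

Moving the nuclear norm onto J leaves Z only in smooth terms, which is what makes the variables separable in the alternating updates.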

The model is solved by minimizing its augmented Lagrangian function. Taking derivatives and fixing the other two variables in turn, the variables J, Z, and E are updated alternately, after which the parameters Y and U are updated. The problem is thus divided into one sub-problem per variable.
The updated graph regularization model is shown in Figure 6.
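The update formulas themselves are not reproduced here. In alternating schemes of this kind, the nuclear-norm sub-problem for J is commonly solved by singular value thresholding, and a noise term with an assumed $\ell_{2,1}$ penalty is solved by column-wise shrinkage. The sketch below shows only these two building blocks under those assumptions; it is not the complete solver.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)      # shrink singular values toward zero
    return (U * s) @ Vt

def l21_shrink(Q, tau):
    """Column-wise shrinkage: proximal operator of tau * l2,1 norm."""
    norms = np.linalg.norm(Q, axis=0)
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return Q * scale

M = np.diag([3.0, 1.0, 0.2])
J = svt(M, 0.5)            # singular values 3, 1, 0.2 become 2.5, 0.5, 0
Q = np.array([[3.0, 0.1],
              [4.0, 0.0]])
E = l21_shrink(Q, 1.0)     # column norms 5 and 0.1 are scaled by 0.8 and 0
```

Thresholding removes small singular values, which lowers the rank of J, while the column-wise shrinkage zeroes columns whose norm falls below the threshold.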

Semantic representation and graph regularization entity link experiment

4.1 Entity link system structure
According to the characteristics of the task, the entity link system is split into two main modules: candidate entity generation and candidate entity disambiguation. Figure 7 is a schematic diagram of a microblog entity linking system [22].
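The two-module structure can be illustrated with a deliberately small sketch. The alias table, the entity identifiers, and the word-overlap rule are hypothetical stand-ins for a real knowledge base and a real disambiguation model.

```python
# Hypothetical alias table: surface form -> candidate entity ids.
ALIASES = {
    "zhang san": ["Zhang_San_(athlete)", "Zhang_San_(singer)"],
}
# Hypothetical entity descriptions used as disambiguation context.
DESCRIPTIONS = {
    "Zhang_San_(athlete)": "basketball player team season",
    "Zhang_San_(singer)": "singer album concert song",
}

def generate_candidates(mention):
    """Module 1: look up candidate entities for a mention."""
    return ALIASES.get(mention.lower(), [])

def disambiguate(mention, context):
    """Module 2: pick the candidate whose description shares the most
    words with the context; return "NIL" when there is no candidate."""
    candidates = generate_candidates(mention)
    if not candidates:
        return "NIL"
    words = set(context.lower().split())

    def overlap(entity):
        return len(words & set(DESCRIPTIONS[entity].split()))

    return max(candidates, key=overlap)

linked = disambiguate("Zhang San", "Zhang San released a new album and song")
unknown = disambiguate("Li Si", "no entry for this name")
```

Mentions with no candidate are returned as NIL, which corresponds to entities that are not registered in the knowledge base.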

1) Difficulties of entity linking
Due to the diversity and complexity of named entities, entity linking faces various problems, such as the ambiguity of entity references and the referential diversity of entities.

1) The ambiguity of entity reference
The ambiguity of entity reference generally means that an entity reference has multiple meanings, and it is impossible to determine which entity the reference points to from its surface form alone. Duplicate names are one of the most representative ambiguity problems of entity references. As shown in Figure 8, the entity reference "Zhang San" corresponds to 3 different persons. Without further information, it is difficult to judge which person it specifically refers to. Place names and organization names also suffer from entity ambiguity [23].

2) Referential diversity of entities
Entity referent diversity generally means that a named entity in the knowledge base often has many entity references. If an entity reference not covered by the knowledge base is used in the background document, it is challenging to link the reference to the corresponding named entity. For example, the American basketball star Stephen Curry (Figure 9) has as many as 8 entity references (including nicknames), and new references continue to appear [24].
[Figure 9: entity references for Stephen Curry, including "Cute god" and "Primary school student"]

3) The deep semantic relationship model
To calculate the relevance of entities in terms of local consistency, this paper advocates learning latent semantic entity representations, which can reflect the latent semantics of entities. A distinguishing point is that, when constructing the feature vector layer of the model, we comprehensively use four types of information in the knowledge base to represent each entity: related entities, entity relationships, entity types, and entity descriptions. Above the word hashing layer, we set up multiple hidden layers to perform the non-linear mapping. With an objective function designed for entity relations, the deep neural network can learn useful semantic features through the back-propagation algorithm [25].
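The word hashing layer is named above without a definition. One common choice in deep semantic models, assumed here purely for illustration, is letter-trigram hashing: each word is wrapped in boundary marks and split into overlapping three-letter pieces, and a text is represented by the counts of these pieces.

```python
from collections import Counter

def letter_trigrams(word):
    """Split a word into letter trigrams with boundary marks."""
    marked = "#" + word.lower() + "#"
    return [marked[i:i + 3] for i in range(len(marked) - 2)]

def hash_text(text):
    """Bag of letter trigrams for a whitespace-tokenized text."""
    counts = Counter()
    for token in text.split():
        counts.update(letter_trigrams(token))
    return counts

trigrams = letter_trigrams("good")     # ['#go', 'goo', 'ood', 'od#']
vector = hash_text("good food")        # 'ood' and 'od#' each occur twice
```

Such a representation keeps the input layer small and lets words with shared spelling share features, before the hidden layers apply the non-linear mapping.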

4) Deep semantic relationship model training
To train a deep semantic relationship model that captures entity semantics sensitive to entity relationships, we first automatically extract training data from the knowledge base and Wikipedia annotations. In addition to using the linked entity pairs in the knowledge base as positive training samples, we also draw more training samples from Wikipedia, especially negative training samples. In training the model, we use maximum likelihood estimation to estimate the model parameters, maximizing the probability of the positive training samples and thereby minimizing the loss function.
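A minimal sketch of such a likelihood-based loss follows. It assumes a softmax over scaled cosine similarities between an entity and one positive plus several negative samples; the smoothing factor `gamma` and the two-dimensional vectors are illustrative assumptions, not the trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pair_loss(anchor, positive, negatives, gamma=10.0):
    """Negative log-likelihood of the positive sample under a softmax
    over scaled cosine similarities."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [gamma * s for s in sims]
    m = max(logits)                                   # for numerical stability
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[0]

anchor = [1.0, 0.0]
good = pair_loss(anchor, [0.9, 0.1], [[0.0, 1.0]])   # positive is close
bad = pair_loss(anchor, [0.0, 1.0], [[0.9, 0.1]])    # positive is far
```

Minimizing this loss raises the probability assigned to positive pairs relative to negative ones, which is the stated training goal.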

Description of experimental data set
The data sets selected in this experiment are both Chinese Weibo data sets, provided by the entity link tasks of the Natural Language Processing and Chinese Computing Conference in 2013 and 2014, respectively. All data are given in XML format. The two data sets are independent of each other, and their entities are not correlated either. We also compiled statistics on both data sets. The 2013 data set consists of 964 Weibo posts containing a total of 1,498 entities; the detailed statistics are given in Table 1. The 2014 data set consists of 1,257 Weibo posts containing 1,402 entities; the detailed statistics are given in Table 2. ... and the accuracy rate of unregistered entities.

1) Comparative analysis of overall data accuracy
This article first conducted entity link experiments on the 2013 and 2014 data sets to compare and analyze the accuracy of different methods on the overall data sets.
Because the named entities to be linked are already given in the data sets, we follow the usual practice and use accuracy to measure the effect of each entity linking strategy on the overall data. The recall rate and F value are not considered here.
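Since the mentions are given, accuracy reduces to the fraction of mentions linked to the correct entity. A small sketch with hypothetical labels:

```python
def linking_accuracy(predicted, gold):
    """Fraction of mentions whose predicted entity equals the gold entity."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Hypothetical gold and predicted links for four mentions.
gold = ["E1", "E2", "NIL", "E4"]
predicted = ["E1", "E3", "NIL", "E4"]
acc = linking_accuracy(predicted, gold)   # 3 of 4 correct
```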
The experimental results of each method on the overall 2013 data set are shown in Table 3. Best_2013 denotes the best score achieved on this data set in the evaluation.
The experimental results of each method on the overall 2014 data set are shown in Table 4. Best_2014 denotes the best score achieved on this data set in the evaluation.

Translation efficiency analysis

1) English Semantic Translation Analysis
The experimental outcome show that the DNN_EL method has achieved the highest entity link accuracy rate, achieving good results of 89.3% and 88.3% on the 2013 and 2014 data sets, respectively, which is better than the best results on each data set; the accuracy rate of the VSM_EL method is second, with a success rate of 86.7% in the two overall data sets; the worst performer is the Lucene_EL method, with accuracy rates of 65.3% and 64.9%, respectively.Through data analysis, we found that, except for unregistered entities that do not need to perform entity disambiguation and return to NIL directly, the results will reflect the effect of entity disambiguation by various methods.The entity link method based on Lucene only uses query keywords for entity disambiguation, and the experimental results are not particularly ideal; the process depends on the vector space model, further uses the context information of the named entity, and the experimental output is greatly improved; the best is the method based on semantic representation and graph regularization, which incorporates more features and achieves an accuracy of more than 90%.The visual display of the experimental results of each method is shown in Figure 10.
By comparing the two graphs, we can see that among the four methods DNN-EL has the highest accuracy, followed by VSM-EL and Best-2013, with Lucene-EL the lowest. In the 2014 data set, DNN-EL again has the highest translation semantic accuracy and Lucene-EL the lowest. In actual use it is therefore best to use DNN-EL for English semantic translation, which better ensures the accuracy of the translation. In the display diagram of the various indicators of the entity link results, the indicators of DNN-EL on the 2013 data set, whether accuracy, recall, or F-value, all rank among the top. Best-2013 is also among the best, followed by VSM-EL, whose indicators are still good. The worst is Lucene-EL: all of its indicators are relatively low, so it is not applicable. In the 2014 data set, the indicators of DNN-EL and Best-2014 are relatively high, and their overall English semantic translation accuracy remains high, so they are preferred. VSM-EL follows, with indicators that are average overall but still at a reasonably high level. The lowest is again Lucene-EL, whose three indicators all fall in the lowest category, so this article does not use this method.
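The three indicators discussed above can be illustrated with a minimal sketch. The article does not give its exact formulas, so the definitions below are assumptions: accuracy is taken as the fraction of mentions linked to the correct target (NIL included), and precision, recall, and F-value are computed for the detection of unregistered (NIL) entities. The function names and the toy data are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Assumption: an unregistered entity (no knowledge-base entry) is encoded as None.
NIL = None


def linking_accuracy(gold: Sequence[Optional[str]],
                     pred: Sequence[Optional[str]]) -> float:
    """Fraction of mentions whose predicted link equals the gold link."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def nil_precision_recall_f(gold: Sequence[Optional[str]],
                           pred: Sequence[Optional[str]]) -> Tuple[float, float, float]:
    """Precision, recall and F-value for detecting unregistered (NIL) mentions."""
    true_pos = sum(g is NIL and p is NIL for g, p in zip(gold, pred))
    pred_nil = sum(p is NIL for p in pred)
    gold_nil = sum(g is NIL for g in gold)
    precision = true_pos / pred_nil if pred_nil else 0.0
    recall = true_pos / gold_nil if gold_nil else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


if __name__ == "__main__":
    # Toy example: five mentions, two of which are unregistered in the knowledge base.
    gold = ["E1", "E2", NIL, "E3", NIL]
    pred = ["E1", "E9", NIL, NIL, NIL]
    print("accuracy:", linking_accuracy(gold, pred))
    print("NIL precision/recall/F:", nil_precision_recall_f(gold, pred))
```

Reporting accuracy alone, as the overall comparison does, is reasonable when the mentions to be linked are given in advance; the NIL-specific indicators become informative once unregistered entities are examined separately.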

2) Feature Extraction Analysis
To obtain reliable experimental results, different recognition algorithms are first used to score the feature points of each segment of English semantic translation. The feature points are then sorted by score from low to high, and the first 600 features, those with the lowest scores, are selected to form a feature subset of the translation target. Next, K-means is used to perform 20 clustering experiments on the obtained target feature subset, and the best clustering result is chosen. For contrast, K-means is also applied directly to all of the original data sets without feature selection. Finally, NMI and ACC are used as evaluation indicators to determine the performance of each algorithm in the clustering experiment. Figure 12 shows the NMI and ACC trend charts obtained by the three scoring methods and by K-means clustering without feature selection on four standard English semantic translation data sets under their respective parameters. Through comparison, we found that: (1) On the expression data set, the clustering accuracy of the LRS score rises overall. When there are fewer than 400 feature points, its fluctuations are relatively large. When the number of feature points is between 400 and 600, the trend is stable and the clustering accuracy is higher than that of the other algorithms overall, although it is lower than the low-rank scoring algorithm at individual feature points. For normalized mutual information, the LRS score shows clearer superiority over the other three algorithms. (2) When the number of selected features is less than 300, the two indicators of the LRS score on the data set are significantly better than those of the other scoring methods; when the number of features is very small, the clustering algorithm without feature screening is considerably better than the other algorithms.
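The selection and evaluation steps of this experiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the helper names are hypothetical, NMI is normalized by the geometric mean of the two entropies (the article does not say which normalization it uses), and ACC is computed by brute-force search over one-to-one mappings from clusters to classes, which is only practical for a handful of clusters. The clustering itself (for example, the best of 20 K-means runs) is assumed to come from any standard K-means implementation.

```python
import math
from collections import Counter
from itertools import permutations


def select_lowest_scoring(scores, m):
    """Indices of the m features with the lowest scores, in ascending score order."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    return order[:m]


def clustering_accuracy(true_labels, cluster_labels):
    """ACC: best fraction of matches over one-to-one maps from clusters to classes."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")
    pair_counts = Counter(zip(cluster_labels, true_labels))
    best_hits = 0
    # Try every assignment of distinct classes to the clusters (brute force).
    for assigned in permutations(classes, len(clusters)):
        hits = sum(pair_counts[(c, k)] for c, k in zip(clusters, assigned))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


def normalized_mutual_information(true_labels, cluster_labels):
    """NMI = I(T; C) / sqrt(H(T) * H(C)); returns 0.0 if either entropy is zero."""
    n = len(true_labels)
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)
    joint_counts = Counter(zip(true_labels, cluster_labels))
    mutual_info = sum(
        (c / n) * math.log(c * n / (class_counts[t] * cluster_counts[k]))
        for (t, k), c in joint_counts.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    h_cluster = -sum((c / n) * math.log(c / n) for c in cluster_counts.values())
    denom = math.sqrt(h_true * h_cluster)
    return mutual_info / denom if denom > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical feature scores: keep the two lowest-scoring features.
    print(select_lowest_scoring([0.9, 0.1, 0.5, 0.3], 2))
    truth = [0, 0, 1, 1, 2, 2]
    found = [1, 1, 0, 2, 2, 2]  # cluster ids are arbitrary, one point misplaced
    print(clustering_accuracy(truth, found))
    print(normalized_mutual_information(truth, found))
```

Both indicators ignore how the clusters happen to be numbered, which is why they suit a comparison across repeated K-means runs with different random initializations.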

The correct rate of English semantic translation
In English semantic translation, we pursue not only translation speed but also translation accuracy; for English translation, conveying the semantics correctly is the most important thing. Here we compare the traditional English semantic translation with the improved English semantic translation in terms of translation efficiency, translation speed, and accuracy. To this end, we designed an experiment that runs both modes on a large number of data sets and tests the stability of their translation; the test results are shown in Figure 13. From Figure 13, it is not difficult to see that the improved English semantic translation is significantly better than the traditional one in terms of translation rate and accuracy. The accuracy of the conventional translation mode stays between 80% and 85%, while that of the improved semantic translation stays between 90% and 95%, an increase of 10%-15%. This translation mode can therefore be applied well in actual translation, with high translation accuracy.

Discussion
According to Figure 10, the DNN_EL technique has the greatest entity link accuracy, reaching 89.3% and 88.3% on the 2013 and 2014 data sets, respectively. The VSM_EL and Lucene_EL methods came in second and third on each data set, respectively. Figure 11 shows a visual depiction of several indicators of the knowledge base's link results for unregistered entities. Among the four models, DNN-EL has the highest accuracy, followed by VSM-EL and Best-2013, with Lucene-EL the lowest. In the 2014 data set, DNN-EL has the highest translation semantic accuracy and Lucene-EL the lowest.
For the four typical English semantic translation data sets, Figure 12 gives the NMI and ACC trend charts produced by the three scoring techniques and by K-means clustering without feature selection. On the expression data set, the clustering accuracy of the LRS score rises overall. The variations are quite substantial when there are fewer than 400 feature points. When the number of feature points is between 400 and 600, the trend is stable and the overall clustering accuracy is greater than that of the other methods. For normalized mutual information, the LRS score shows clearer superiority over the other three techniques. The two indicators of the LRS score perform significantly better than those of the other scoring methods when the number of selected features is under 300. When the number of features is very low, the clustering algorithm without feature screening performs significantly better than the other algorithms. Figure 13 demonstrates that the improved semantic translation accuracy is maintained at 90%-95%, while the traditional translation accuracy is maintained between 80% and 85%, an increase of 10%-15%.
Compared with the improved GLR algorithm and the RFNet strategy, DNN-EL has the highest accurate information rate. It may be difficult for the improved GLR algorithm to resolve syntactic and semantic difficulties clearly, which makes it difficult to extract precise and contextually relevant information. Additionally, the improved GLR algorithm often evaluates statements only in their immediate context, without taking conversation or larger context into account. Feature extraction for semantic translation often needs to capture contextual dependencies, such as anaphora resolution, co-reference, or knowledge of discourse interactions. It may be difficult for the algorithm to effectively reflect these contextual subtleties due to its low context sensitivity. In contrast, RFNet was designed mainly for visual tasks and may not have any innate language comprehension ability. Additionally, RFNet must analyze visual data in real time, which places demands on the efficiency and availability of processing resources.

Conclusions
This article mainly studies the feature extraction of English semantic translation. Through the construction of graph regularization knowledge, model construction, and a comparison of three feature extraction methods, the better-performing feature extraction method is identified, and popular regularization terms are constructed to analyze the graph regularization. At the same time, the article explores the recognition pattern of the recognition algorithm, builds an efficient English semantic translation method, and investigates the accuracy of the enhanced method. In the end, it is concluded that the accuracy of the improved English semantic translation is 10%-15% higher than that of the traditional translation. This is a strong result for practical English translation and can effectively enhance the semantic understanding of English translation. In the future, to take up the difficulties of ambiguous language and words with numerous meanings, we may be able to use sophisticated natural language processing algorithms and semantic analysis to categorize words and sentences depending on their context.

Figure 1: Several types and applications of the English translation.

Figure 2: Three different types of text
… $[x_1, x_2, \dots, x_n] \in \mathbb{R}^{d \times n}$, where $d$ represents the feature dimension of the data, $n$ is the total number of sample points, and $x_i$ is a data sample point in the form of a column vector.
By comparing and analyzing the formulas of three standard feature selection algorithms, we conclude that the coefficient scoring algorithm among the three selection algorithms has better feature selection. This can be well applied to our research topic. So we finally adopted the sparse score selection method for feature screening.
Low-rank representation coefficient matrix; graph weight connection matrix; graph regular low-rank scoring; feature subsets of target translation methods

Figure 4: Translation feature selection method based on graph regular low-rank scoring.

Feature Extraction of English Semantic Translation Relying on… Informatica 47 (2023) 103-124
After reducing, you can get the following:

Figure 6: Schematic diagram of the updated graph regularization model.

Figure 7: Schematic diagram of the entity link system.

Figure 9: Examples of referential entity diversity.

Figure 10: Display of the accuracy rate of all data in 2013 and 2014.

Figure 11: Display diagram of various indicators of the link results of unregistered entities in the knowledge base.

Figure 12: NMI and ACC charts with and without feature selection.

Figure 13: Traditional English semantic translation and improved English semantic translation.

Table 1: Detailed statistics of the 2013 entity link evaluation data set

Table 2: Detailed statistics of the 2014 entity link evaluation data set

… experimental outcome, this study analyzes the overall accuracy rate, the accuracy rate of logged-in entities, …

Table 3: Accuracy statistics of the overall data set in 2013

Table 4: Accuracy statistics of the overall data set in 2014

Table 5: Accuracy statistics of the overall data set after training