Scientific Articles

Mayasoundex: A Phonetically Grounded Algorithm for Information Retrieval in the Maya Language

  • Alejandro Molina-Villegas

Abstract

This paper introduces Mayasoundex, a phonetically grounded algorithm tailored for information retrieval in the Maya language. Mayasoundex utilizes phonetic principles to generate consistent codes for words with similar sounds, promoting phonetic similarity in information retrieval tasks. Drawing upon the distinctive phonological characteristics of the Maya language, the algorithm offers a robust approach to indexing and searching linguistic data....


Incorporating Natural Language Processing models in Mexico City's 311 Locatel

  • Alejandro Molina-Villegas, Edwin Aldana-Bobadilla, Oscar Siordia, Jorge Luis Perez Hernández

Technologies based on Natural Language Processing (NLP) are transforming various sectors by enabling new ways of delivering services through Artificial Intelligence (AI). In this article, we present the challenges encountered and the methods used while building a Deep Learning-based model for classifying citizen service requests. The AI can effectively distinguish among 48...


The Geopolitical Repercussions of US Anti-immigrant Rhetoric on Mexican Online Speech About Migration: A Transdisciplinary Approach

  • Cattin, T., Molina-Villegas, Alejandro, Fuentes-Carrera, J., Siordia, O.S.

This paper presents an ongoing research project that aims to propose a geopolitical analysis of anti-immigrant speech published on the Mexican twitosphere. While Mexico has long defined itself as an emigration country, the apparent growing presence of anti-immigrant discourse online, especially at the Mexican borders, invites us to question the impact of Americans’ anti-immigrant speech, bolstered by Donald Trump’s election and presidency, on Mexicans’ representations. We...


Automation of Topic Generation in Government Information Requests in Mexico

  • Cruz-Pérez, Hermelando, Molina-Villegas, Alejandro, Aldana-Bobadilla, Edwin

In Mexico, legislation guarantees public access to information, empowering citizens to request data from the government. This research delves into the National Transparency Platform's extensive archive, which includes over 2 million requests for information, with the goal of discerning the primary interests of citizens in government actions from 2003 to 2020. Through the analysis of 2,518,875 requests, Genetic Algorithms were employed to fine-tune three crucial hyperparameters of the...


La incidencia de las voces misóginas sobre el espacio digital en México

  • Molina-Villegas, A. (2022)

The advent of digital platforms at the beginning of the century shook society, completely changing the mechanisms of human communication. As a consequence, several phenomena have emerged, notably the use of violent language in the digital space. This chapter analyzes the spatio-temporal incidence of misogynistic voices on Twitter from September 2017 to October 2018. It also...


A novel data reduction method based on information theory and the Eclectic Genetic Algorithm

  • Aldana-Bobadilla, E., Lopez-Arevalo, I., & Molina-Villegas, A.

A common task in data analysis is to find the appropriate data sample whose properties allow us to infer the parameters and behavior of the data population. In data mining this task makes sense since usually the population is significantly huge, and thus it is required (for practical reasons) to obtain a subset that preserves its properties. In this regard, statistics offers some sampling techniques usually based on asymptotic results from the Central Limit Theorem. The effectiveness of such...


A content spectral-based text representation

  • Crespo-Sanchez, M., Lopez-Arevalo, I., Aldana-Bobadilla, E., & Molina-Villegas, A. (2022)

Abstract. In the last few years, text analysis has grown as a keystone in several domains for solving many real-world problems, such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached by means of machine learning algorithms. Most of these algorithms take as input a transformation of the text in the form of feature vectors containing an abstraction of the content. Most recent vector representations focus on the semantic...


A language model for misogyny detection in Latin American Spanish driven by multisource feature extraction and transformers

  • Aldana-Bobadilla, E., Molina-Villegas, A., Montelongo-Padilla, Y., Lopez-Arevalo, I., & Siordia, O. S. (2021)

Creating effective mechanisms to detect misogyny online automatically represents significant scientific and technological challenges. The complexity of recognizing misogyny through computer models lies in the fact that it is a subtle type of violence, it is not always explicitly aggressive, and it can even hide behind seemingly flattering words, jokes, parodies, and other expressions. Currently, it is even difficult to have an exact figure for the rate of misogynistic comments online because,...


A memory-efficient encoding method for processing mixed-type data on machine learning

  • Lopez-Arevalo, I., Aldana-Bobadilla, E., Molina-Villegas, A., Galeana-Zapién, H., Muñiz-Sanchez, V., & Gausin-Valle, S. (2020)

The most common machine-learning methods solve supervised and unsupervised problems based on datasets where the problem’s features belong to a numerical space. However, many problems often include data where numerical and categorical data coexist, which represents a challenge to manage them. To transform categorical data into a numeric form, preprocessing tasks are compulsory. Methods such as one-hot and feature-hashing have been the most widely used encoding approaches at the expense...


Geographic Named Entity Recognition and Disambiguation in Mexican News using Word Embeddings

  • Molina-Villegas, A., Muñiz-Sanchez, V., Arreola-Trapala, J., & Alcántara, F.

In recent years, dense word embeddings for text representation have been widely used since they can model complex semantic and morphological characteristics of language, such as meaning in specific contexts and applications. Contrary to sparse representations, such as one-hot encoding or frequencies, word embeddings provide computational advantages and improvements on the results in many natural language processing tasks, similar to the automatic extraction of geospatial information. Computer...


Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text

  • Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala

The automatic extraction of geospatial information is an important aspect of data mining. Computer systems capable of discovering geographic information from natural language involve a complex process called geoparsing, which includes two important tasks: geographic entity recognition and toponym resolution. The first task could be approached through a machine learning approach, in which case a model is trained to recognize a sequence of characters (words) corresponding to geographic...


What do we talk about when we talk about milpa? A conceptual approach to the significance, topics of research and impact of the mayan milpa system

  • Karla Rodríguez, María Elena Méndez-López, Alejandro Molina-Villegas, Lilián Juárez

The Maya milpa is an ancient agrarian system extended along Mesoamerica, widely known for the slash and burn practices and the polyculture of maize, beans and squashes. Although the Maya milpa has more than three centuries of history, there are several challenges that must be tackled in order to promote the preservation of the system. The aim of this paper is to settle a new starting point in Maya milpa research through a synthesis around the analysis Maya milpa...


Geographical aggregation of microblog posts for LDA topic modeling

  • Pablo López-Ramírez, Alejandro Molina-Villegas, Oscar S. Siordia

In this paper we propose an aggregation strategy for geolocated Twitter posts based on a hierarchical definition of the regular activity patterns within a specific region. The aggregation yields a series of documents that are used to train a topic model. The resulting model is tested against the ones produced by two other aggregation strategies proposed in the literature: aggregation by user and by hashtag. For comparison, we use quality metrics widely used in the literature. The results show...


Active learning in annotating micro-blogs dealing with e-reputation

  • Jean-Valère Cossu, Alejandro Molina-Villegas, Mariana Tello-Signoret

Elections unleash strong political views on Twitter, but what do people really think about politics? Opinion and trend mining on micro-blogs dealing with politics has recently attracted researchers in several fields including Information Retrieval and Machine Learning (ML). Since the performance of ML and Natural Language Processing (NLP) approaches is limited by the amount and quality of data available, one promising alternative for some tasks is the automatic propagation of expert...


Recuperación, procesamiento y clasificación de tuits para visualizar estructuras de interacción

  • Carlos Pérez, Jorge Cortés, Aaron Ramirez, Rocío Abascal-Mena, Alejandro Molina-Villegas

In the context of digital social media, where users can connect in multiple ways, it is important to have tools for analyzing the interaction processes that take place on these platforms. Social network analysis frequently uses node-link diagrams to represent the relationships among a set of actors. However, the visual representation of graphs with additional information in...


El Test de Turing para la evaluación de resumen automático de texto

  • Alejandro Molina-Villegas, Juan Torres

Several methods currently exist for producing text summaries automatically, but evaluating them remains a challenging problem. In this article we study the quality evaluation of summaries produced automatically by a sentence compression method. We address the problems posed by automatic metrics such as ROUGE, which do not take into...


Evolutionary approach for detection of buried remains using hyperspectral images

  • León Dozal, José L. Silván Cárdenas, Daniela Moctezuma, Oscar S. Siordia, and Enrique Naredo.

Programming technique called Brain Programming (BP) for automating the design of Hyperspectral Visual Attention Models (H-VAM), which is proposed as a new method for the detection of buried remains. Four graves were simulated and monitored for six months by taking in situ spectral measurements of the ground. Two experiments were implemented using Kappa and weighted Kappa coefficients as classification accuracy measures for guiding the BP search of the...


Medic-Us an Intelligent Social Network for Medical Services

  • Gandhi Hernández Chan, Oscar S. Siordia, Alejandro Molina-Villegas, Mario Chirinos-Colunga, and Jorge Canto-Esquivel

Health services are among the top priorities for society, but so far we have failed to make them universal around the world. Nowadays information technologies, especially social networks, have demonstrated their usefulness in different areas. This article describes the design and development of a social network platform focused on physician-patient and physician-physician interactions, in order to achieve better and faster diagnoses. Like other social networks or social media tools, it focuses...


Regular Activity Patterns in Spatio-Temporal Events Databases: Multi-Scale Extraction of Geolocated Tweets

  • Pablo López-Ramírez, Alejandro Molina-Villegas, Oscar S. Siordia, Mario Chirinos and Gandhi Hernandez Chan

This paper proposes a new technique for the extraction of regular activity patterns at different scales (resolution levels), mined from the microblogging platform Twitter. The approach is based on the recursive application of the DBSCAN clustering algorithm to the geolocated Twitter feed. The proposed technique includes a novel way to obtain ’averaged’ regular activity zones based on the rasterization and aggregation of the Concave Hull of the clusters identified at each...


A simple approach to multilingual polarity classification in twitter

  • Eric S. Tellez, Sabino Miranda-Jiménez, Mario Graff, Daniela Moctezuma, Ranyart R. Suárez, and Oscar S. Siordia

Recently, sentiment analysis has received a lot of attention due to the interest in mining opinions of social media users. Sentiment analysis consists in determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, Sentiment Analysis algorithms have been tailored to a specific language given the complexity of having a number of lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a...


A case study of spanish text transformations for twitter sentiment analysis

  • Eric S. Tellez, Sabino Miranda-Jiménez, Mario Graff, Daniela Moctezuma, Oscar S. Siordia, and Elio A. Villaseñor.

Sentiment analysis is a text mining task that determines the polarity of a given text, i.e., its positiveness or negativeness. Recently, it has received a lot of attention given the interest in opinion mining in micro-blogging platforms. These new forms of textual expressions present new challenges to analyze text because of the use of slang, orthographic and grammatical errors, among others. Along with these challenges, a practical sentiment classifier should be able to handle efficiently...


Who participates in conservation initiatives? Case studies in six rural communities in Mexico

  • Méndez-López, M.E., E. García-Frapolli, I. Ruiz-Mallén, L. Porter-Bolland, C. Sánchez-González and V. Reyes-García

Previous studies attempting to explain the factors that determine local participation in conservation initiatives have concluded that socio-political exclusion is the main barrier to being involved in such initiatives. Such studies have not differentiated between different types of conservation initiatives. In this paper, we contribute to the literature analyzing the socio-cultural correlates of participation, by differentiating between participation in three types of conservation schemes:...


Determinants of livelihood diversification: The case of wildlife tourism in four coastal communities in Oaxaca, Mexico

  • Véronique Sophie Avila-Foucat, Karla Juliana Rodríguez-Robayo

Diversification is a process by which households increase the number of economic activities in different sectors to improve their well-being and chance of survival. The aim of this research is to study the determinants of livelihood diversification with a specific emphasis on wildlife watching in the coastal communities of Oaxaca, Mexico. Based on household surveys, two econometric models were used to examine the differences regarding the asset determinants for those households increasing the...


Preserve and produce: Experience in implementing payments for environmental services in two indigenous communities in the northern and southern ranges

  • Rodríguez-Robayo, K., Merino-Pérez, L. Journal of Sustainable Forestry, 2018.

Payments for environmental services (PES) are conservation instruments in place in various Latin American countries. They are generally undergoing adjustment and implementation changes, and they are widely implemented in indigenous communities. This article aims to suggest a relevant group of context variables in PES implementation. Characterizing the local context of two indigenous communities located in Oaxaca, Mexico, and analyzing the relationship between the local context and PES...


Variational phase recovering without phase unwrapping in phase-shifting interferometry

  • R. Legarda-Sáenz, A. Téllez-Quiñones, C. Brito-Loeza, and A. Espinosa-Romero.

We present a variational method for recovering the phase term from the information obtained from phase-shifting methods. First we introduce the new method based on a variational approach and then describe the numerical solution of the proposed cost function, which results in a simple algorithm. Numerical experiments with both synthetic and real fringe patterns show the accuracy and simplicity of the resulting algorithm.


Dual-plane slightly off-axis digital holography based on a single cube beam splitter

  • M. León-Rodríguez, J.A. Rayas, R.R. Cordero, A. Martínez-García, A. Martínez-Gonzalez, A. Téllez-Quiñones, P. Yañez-Contreras, and O. Medina-Cázares

In order to recover the holographic object information, a method based on the recording of two digital holograms, not only at different planes but also in a slightly off-axis scheme, is presented. By introducing a π-phase shift in the reference wave, the zero-order diffracted term and the twin image are removed in the frequency domain during the processing of the recorded holograms. We show that the zero-order elimination by the phase-shifted holograms is better than working with...


Comparison of multihardware parallel implementations for a phase unwrapping algorithm

  • Francisco Javier Hernandez-Lopez, Mariano Rivera, Adan Salazar-Garibay, Ricardo Legarda-Sáenz

Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and more accurate phase...


Built-up index methods and their applications for urban extraction from Sentinel 2A satellite data: discussion

  • J. C. Valdiviezo-Navarro, A. Téllez-Quiñones, A. Salazar-Garibay, A. López-Caloca.

Several built-up indices have been proposed in the literature in order to extract the urban sprawl from satellite data. Given their relative simplicity and easy implementation, such methods have been widely adopted for urban growth monitoring. Previous research has shown that built-up indices are sensitive to different factors related to image resolution, seasonality, and study area location. Also, most of them confuse urban surfaces with bare soil and barren land covers. By gathering the...


The retarded potential of a non-homogeneous wave equation: introductory analysis through Green functions

  • A. Téllez-Quiñones, J. C. Valdiviezo-Navarro, A. Salazar-Garibay, A. López-Caloca

The retarded potential, a solution of the non-homogeneous wave equation, is a subject of particular interest in many physics and engineering applications. Examples of such applications may be the problem of solving the wave equation involved in the emission and reception of a signal in a synthetic aperture radar (SAR), scattering and backscattering, and general electrodynamics for media free of magnetic charges. However, the construction of this potential solution is based on the theory of...