New project makes Wikipedia data more accessible to AI
On Wednesday, Wikimedia Deutschland announced a new database that will make Wikipedia’s wealth of knowledge more accessible to AI models. Called the Wikidata Embedding Project, the system applies a vector-based semantic search — a technique that helps computers understand the meaning and relationships between words — to the existing data on Wikipedia and its sister platforms, consisting of nearly 120 million entries. Combined with new support for the Model Context Protocol (MCP), a standard that helps AI systems communicate with data sources, the project makes the data more accessible to natural language queries from LLMs. The project was undertaken by Wikimedia’s German branch in collaboration with the neural search company Jina.AI and DataStax, a real-time…