Presenting WikiDataDotNet – Client API for WikiData

WikiData

WikiData is one of those things that sets the mind boggling at the possibilities of the internet. It’s a project, started by the Wikimedia Foundation, to collect structured data on everything. If you are doing anything related to machine learning, it is the best source of data I have found so far.
It aims to contain an item on everything and, for each item, a collection of statements describing aspects of it and its relationship to other items. Everything makes more sense with an example, so here is its record on the item Italy, which can be fetched from the API by its id, Q38, using the same wbgetentities request as the lookups shown further down.
This will return a JSON file with sections like:
"id": "Q38",
"labels": {
    "en": {
        "language": "en",
        "value": "Italy"
    },
Here we see the id of the item, in this case Q38, which is what is used to look Italy up. Then labels contains the name of Italy in each language. Further down there is also a section, aliases, that contains alternate names for Italy in every language.
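As a rough illustration of reading that labels section from .Net, here is a minimal sketch using System.Text.Json. The class and method names (EntityLabels, GetLabel) are made up for this example and are not part of WikiDataDotNet. It assumes you pass in the JSON of a single entity object shaped like the fragment above; as far as I know the full API response nests that object under entities and then the id, so you would need to unwrap it first.

```csharp
using System;
using System.Text.Json;

public static class EntityLabels
{
    // Hypothetical helper (not part of WikiDataDotNet): returns the label of the
    // entity in the requested language, or null when that label is missing.
    // Assumes entityJson is one entity object with a "labels" section.
    public static string GetLabel(string entityJson, string language)
    {
        using (JsonDocument doc = JsonDocument.Parse(entityJson))
        {
            JsonElement root = doc.RootElement;
            if (root.TryGetProperty("labels", out JsonElement labels) &&
                labels.TryGetProperty(language, out JsonElement label) &&
                label.TryGetProperty("value", out JsonElement value))
            {
                return value.GetString();
            }
            return null;
        }
    }
}
```

Called with a trimmed-down entity containing only the id and the English label from the fragment above, GetLabel(json, "en") returns Italy, and asking for a language that is not present returns null.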
Further down we get to the really interesting stuff, claims.
"P36": [
    {
        "mainsnak": {
            "snaktype": "value",
            "property": "P36",
            "datavalue": {
                "value": {
                    "entity-type": "item",
                    "numeric-id": 220
                },
                "type": "wikibase-entityid"
            },
            "datatype": "wikibase-item"
        },
        "type": "statement",
        "qualifiers": {
            "P580": [
These are a series of statements about different aspects of the item. For example, the P36 above is a claim about what the capital of Italy is. The properties that claims use are also entities in the API, so they can be looked up in the same way: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=P36
mainsnak is the main statement associated with this claim (a Snak in WikiData is any basic assertion that can be made about an item). These all have a value and a type. In this case, the claim about Italy’s capital, the value is a reference to another WikiData item, which can again be looked up if you prepend a Q to the numeric id. You may have already worked out what the entity here is: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q220
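To make the "prepend a Q to the numeric id" step concrete, here is a small sketch that walks the statements for one property and collects the item ids they point at. ClaimTargets and GetItemIds are hypothetical names for this example, not WikiDataDotNet functions, and the code assumes it is handed the claims object on its own, with each statement shaped like the P36 fragment above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ClaimTargets
{
    // Hypothetical helper: given the "claims" object of an entity and a property id
    // such as "P36", returns ids like "Q220" for every statement whose main snak
    // points at another item.
    public static List<string> GetItemIds(string claimsJson, string propertyId)
    {
        var ids = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(claimsJson))
        {
            if (!doc.RootElement.TryGetProperty(propertyId, out JsonElement statements))
                return ids; // the entity has no claims for this property

            foreach (JsonElement statement in statements.EnumerateArray())
            {
                JsonElement snak = statement.GetProperty("mainsnak");
                // Only snaks of type "value" carry a datavalue; skip the others.
                if (snak.GetProperty("snaktype").GetString() != "value") continue;

                JsonElement dataValue = snak.GetProperty("datavalue");
                if (dataValue.GetProperty("type").GetString() != "wikibase-entityid") continue;

                JsonElement target = dataValue.GetProperty("value");
                // Items get a Q prefix; anything that is not an item is ignored here.
                if (target.GetProperty("entity-type").GetString() != "item") continue;

                ids.Add("Q" + target.GetProperty("numeric-id").GetInt32());
            }
        }
        return ids;
    }
}
```

For the P36 statement shown above this yields a single entry, Q220, which is the id used in the lookup URL.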
Other claims on Italy include location, who it shares a border with, public holidays, provinces, basic form of government, head of state, population (across history), head of government; the list is endless (no wait, actually it’s 64 entries long).

Presenting WikiDataDotNet

I’ve been working on a project that needed to query WikiData from .Net. The only existing .Net API for this I could find is Wikibase.NET, which is meant for writing wiki bots. It hasn’t been updated in a while and, unfortunately, a quick test reveals it no longer works. At a future date I may fix it up, but in the meantime I’ve created this quick, query-only API: WikiDataDotNet

Usage

It currently provides the ability to request entities:
F#
 let italy = WikiDataDotNet.Request.request_entity "Q38"   
C#
 var italy = WikiDataDotNet.Request.request_entity("Q38");  
and do a text search against WikiData:
F#
 let search_result = WikiDataDotNet.Request.search "en" "Headquarters of the U.N"  
C#
 var searchResult = WikiDataDotNet.Request.search("en", "Headquarters of the U.N");  
That’s it for functionality so far. My next plans are to make it easier to look up Claims against items and to cache Claims. Some kind of LINQ-style querying interface would also be nice.
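For anyone curious what a text search looks like at the HTTP level, WikiData exposes one through the wbsearchentities action of the same api.php endpoint used above. The sketch below only builds the request URL. SearchUrl and Build are hypothetical names, and I am not claiming this is what WikiDataDotNet does internally.

```csharp
using System;

public static class SearchUrl
{
    // Hypothetical helper: builds a wbsearchentities request URL for the given
    // language code and search text, escaping both for use in a query string.
    public static string Build(string language, string text)
    {
        return "https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json"
             + "&language=" + Uri.EscapeDataString(language)
             + "&search=" + Uri.EscapeDataString(text);
    }
}
```

SearchUrl.Build("en", "Headquarters of the U.N") produces a URL ending in &language=en&search=Headquarters%20of%20the%20U.N, which can be pasted into a browser to see the raw JSON.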

Programming a programming computer game – .Net run time type creator

A while ago a friend and I had an idea for a computer game. You would control a collection of bacteria, all of which needed to feed and would die of old age given enough time. They could also reproduce in order to keep their population going, and fight an enemy bacteria population controlled by another player. The aim of the game was for your bacteria to outcompete the other player’s bacteria on the map. So far so unoriginal. Our new idea was that rather than controlling the creatures by, say, using a mouse and keyboard to give them orders, you would instead control them by writing the code for how they behaved. It would be a real-time competitive programming game.

The game interface would be a map with a text panel on the right where the user would enter code that the creatures would execute to make their decisions. There were commands for where to move, what to eat, when to breed, etc. This was also a really nice space from which to play with algorithms like neural nets, evolutionary algorithms, clustering, A*, etc. We played around with it a bit and had a fair amount of fun, but we eventually realized that even for us who had built it, the game was too complicated for anyone to actually play. At least not in real time. So we abandoned it as a fun experiment.

But I recently saw this post on Stack Overflow that reminded me of that game, so I thought I would share some of the code for how to do in-application code compilation in .Net. Hopefully it will be of use to some people, and if I get enough interest I may try to clean up the rest of the code and release it as an open source project. Because despite being painfully complicated, when it did work it was fun, at least for uber nerds like us.

RunTimeTypeCreator

Here is the one and only method in the library:
public static T CreateType<T>(string source,
    IEnumerable<string> assemblies,
    out List<string> compilationErrors)
    where T : class
It will attempt to create an instance of the type T from the source passed in. The source will be compiled with references to all the assemblies in the assemblies parameter. So for example you could do this with it.
namespace RunTimeTypeCreator.Tests
{
    public interface ITestType
    {
        bool Success { get; }
    }

    public class RunTimeTypeCreatorTests
    {
        public static bool TestVariable = false;

        public void Example()
        {
            const string source = @"
using RunTimeTypeCreator.Tests;
public class TestTypeClass : ITestType
{
    public bool Success { get { return RunTimeTypeCreatorTests.TestVariable; } }
}";
            List<string> compilationErrors;
            var type = RunTimeTypeCreator.CreateType<ITestType>(source,
                new[] { "RunTimeTypeCreator.Tests.dll" }, //the name of this assembly
                out compilationErrors);

            TestVariable = false;
            //will print False
            Console.WriteLine(type.Success);

            TestVariable = true;
            //will print True
            Console.WriteLine(type.Success);
        }
    }
}
Which is kind of cool, I think. Here’s a quick run-through of how it works. This shows the code minus the validation and error reporting, so if you want the full thing I would recommend getting it from GitHub.

//needs: using Microsoft.CSharp; using System.CodeDom.Compiler; using System.Linq;
var csc = new CSharpCodeProvider();

var parameters = new CompilerParameters
{
    //we don't want a physical executable in this case
    GenerateExecutable = false,
    //this is what allows it to access variables in our domain
    GenerateInMemory = true
};

//add all the assemblies we care about
foreach (var assembly in assemblies)
    parameters.ReferencedAssemblies.Add(assembly);

//compile away, will load the class into memory
var result = csc.CompileAssemblyFromSource(parameters, source);

//we compiled successfully so now just use reflection to get the type we want
var types = result.CompiledAssembly.GetTypes()
    .Where(x => typeof(T).IsAssignableFrom(x)).ToList();

//create the type and return
return (T)Activator.CreateInstance(types.First());
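The reflection step at the end can be tried on its own against any loaded assembly, which is handy for experimenting without the compiler. Here is a slightly more defensive sketch of it. TypeFinder and CreateFirst are hypothetical names, not part of the library, and the extra filters are my own assumption about what you would want, since IsAssignableFrom on its own also matches abstract classes and T itself.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class TypeFinder
{
    // Hypothetical helper: finds the first concrete class in the assembly that can be
    // assigned to T and has a public parameterless constructor, then instantiates it.
    // Returns null when no such type exists.
    public static T CreateFirst<T>(Assembly assembly) where T : class
    {
        Type match = assembly.GetTypes()
            .FirstOrDefault(t => typeof(T).IsAssignableFrom(t)
                              && t.IsClass
                              && !t.IsAbstract
                              && !t.ContainsGenericParameters
                              && t.GetConstructor(Type.EmptyTypes) != null);

        return match == null ? null : (T)Activator.CreateInstance(match);
    }
}
```

Returning null rather than letting First() throw leaves it to the caller to decide how to report a source file that compiled but did not contain a usable implementation of T.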
I also had some other code for validating what the user was doing: making sure they weren’t trying to access the file system, open ports, or create memory leaks or recursive loops. I’ll try to clean this up and post it at a future date.

Full code is available here on github.