My Adventures With RavenDB – Getting Distinct List Items

I decided to play around with the RavenDB database system. I wanted to see how quickly I could get a Raven database up and running. I was very impressed with how easy it was; for the most part it was just a matter of storing my records and using Linq to query for them.

No Select Manys

The only issue I came across was that Raven's Linq provider does not support the SelectMany() method. From the discussions I found, this is because Linq queries are run against Lucene, and since Lucene stores its data flat, a query cannot look for data inside a list.

The query I was trying to implement dealt with the following two data structures:

    public class LogRecord
    {
        public Guid SessionId { get; set; }
        public IList<LogField> Fields { get; set; }
        public int RecordNumber { get; set; }
    }

    public class LogField
    {
        public string FieldName { get; set; }
        public string StringValue { get; set; }
        public DateTime? DateValue { get; set; }
    }

What I needed to do was to retrieve a distinct list of all FieldName values for a given SessionId value. Normally with Linq I would use the following code:

    return ravenSession.Query<LogRecord>()
                       .Where(x => x.SessionId == sessionId)
                       .SelectMany(x => x.Fields)
                       .Select(x => x.FieldName)
                       .Distinct();

This fails because SelectMany() is not supported by Raven.

After doing some research, it turned out that I needed to use a Raven Map/Reduce index. Raven indexes work here because they run before Raven puts the data into Lucene, so they can run the equivalent of SelectMany() on the objects themselves rather than on the flat data stored in Lucene.

So in order to do this I coded the following index:

    public class LogRecord_LogFieldNamesIndex : AbstractIndexCreationTask<LogRecord, LogSessionFieldNames>
    {
        public LogRecord_LogFieldNamesIndex()
        {
            Map = records => from record in records
                             from field in record.Fields
                             select new
                             {
                                 SessionId = record.SessionId,
                                 FieldName = field.FieldName
                             };

            Reduce = results => from result in results
                                group result by new { result.SessionId, result.FieldName } into g
                                select new
                                {
                                    SessionId = g.Key.SessionId,
                                    FieldName = g.Key.FieldName
                                };
        }
    }
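To make the two stages easier to follow, here is a rough in-memory sketch that runs the same Map and Reduce logic with plain Linq to Objects, with no Raven involved. The IndexSketch class and its method names are made up for illustration, the data classes are trimmed to the properties the index touches, and the LogSessionFieldNames shape is my assumption based on what the index selects, since the class itself is not shown in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class LogField
{
    public string FieldName { get; set; }
}

public class LogRecord
{
    public Guid SessionId { get; set; }
    public IList<LogField> Fields { get; set; }
}

// Assumed shape of the index result; not taken from the original code.
public class LogSessionFieldNames
{
    public Guid SessionId { get; set; }
    public string FieldName { get; set; }
}

// Hypothetical helper that mimics the index stages in memory.
public static class IndexSketch
{
    // Map: emit one (SessionId, FieldName) entry per field of every record.
    public static IEnumerable<LogSessionFieldNames> Map(IEnumerable<LogRecord> records)
    {
        return from record in records
               from field in record.Fields
               select new LogSessionFieldNames
               {
                   SessionId = record.SessionId,
                   FieldName = field.FieldName
               };
    }

    // Reduce: group on the (SessionId, FieldName) pair so duplicates collapse.
    public static IEnumerable<LogSessionFieldNames> Reduce(IEnumerable<LogSessionFieldNames> results)
    {
        return from result in results
               group result by new { result.SessionId, result.FieldName } into g
               select new LogSessionFieldNames
               {
                   SessionId = g.Key.SessionId,
                   FieldName = g.Key.FieldName
               };
    }

    // Filter the reduced entries by session, as the query against the index does.
    public static List<string> FieldNamesFor(IEnumerable<LogRecord> records, Guid sessionId)
    {
        return Reduce(Map(records))
            .Where(x => x.SessionId == sessionId)
            .Select(x => x.FieldName)
            .ToList();
    }
}
```

Because the Reduce step groups on both SessionId and FieldName, each pair appears only once in the output, which is why the query against the index does not need its own Distinct() call.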

The query I used to access the list of field names now became:

    return ravenSession.Query<LogSessionFieldNames, LogRecord_LogFieldNamesIndex>()
                       .Where(x => x.SessionId == sessionId)
                       .Select(x => x.FieldName)
                       .Customize(x => x.WaitForNonStaleResultsAsOfNow());

Hope this helps someone!

