Sitecore Dictionary gotcha when using the master database

During some projects at work, we ran into a really weird problem with the Sitecore Dictionary feature. This is the built-in feature in Sitecore that lets you localize short texts, such as the "Read more" or "Next page" links on a web page. Of course, there are many ways to do this, but since Dictionary items are a supported Sitecore feature, and they can be edited from within the Sitecore environment, it seems reasonable to use them for Sitecore sites.

This is how it works: You put dictionary items under /sitecore/system/Dictionary. Each dictionary item consists of a shared key and a localized phrase field. That is, the key is the same for all languages, while the phrase varies by language. To get a phrase at runtime, you call the static Translate.Text method and pass the key as a parameter, and it returns the correct phrase in the current context language (or you can explicitly pass a Language). This is also described on SDN (you will need an SDN account to access the article). There is even an XSLT extension function for the Translate methods. Very nice.
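As a rough illustration of the lookup described above, here is a minimal sketch. It assumes a dictionary item with the key "Read more" exists and that Danish ("da") is one of the site languages. The class and method names are made up for the example, and I am assuming that TextByLanguage is the Translate method that takes an explicit Language.

```csharp
using Sitecore.Globalization;

// Hypothetical helper class, only here to show the two kinds of lookup.
public static class DictionaryExamples
{
    public static string ReadMoreLabel()
    {
        // Looks up the key in the current context language.
        return Translate.Text("Read more");
    }

    public static string ReadMoreLabelInDanish()
    {
        // Assumption: TextByLanguage is the variant that takes an explicit Language.
        Language danish = Language.Parse("da");
        return Translate.TextByLanguage("Read more", danish);
    }
}
```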

The problem started when we moved some of our front-end developers (who do most of the localization texts) away from working on a shared test server and gave each of them a development version on their own workstation (the way it really should be done, and it also gives you a better chance of running CI). On one developer's workstation you would see one set of Dictionary items on the website, and on another workstation there would be a different set. Sometimes all dictionary phrases were missing completely. Inside the Sitecore Content Editor, the data seemed to be correct.

So I started digging around to find an explanation, using Reflector to determine what was really going on in the Sitecore.Globalization.Translate class. I quickly found out that the class internally keeps a Hashtable of languages in memory, and in that a Hashtable for each language's texts. Great for quick lookups. So when and how is the Hashtable filled with data? This is where it gets tricky. When a phrase is first requested (after an application restart), Sitecore looks for a file called dictionary.dat in the temp directory and tries to load it. This is where the Translate class keeps its persistent cache: each time a key is added, it is saved to the file (in addition to being kept in memory), and the class tries to load the file when a phrase is requested and there is no dictionary data in memory. If reading the file fails, it rebuilds it from the data in the database.

Here lies the first problem. You might know that Sitecore operates with a core database (for Sitecore itself), a master database for content, and a web database that content is served from. As it turns out, the dictionary will only ever be repopulated with data from the core database. This is hard-coded in the class. And we were adding all our Dictionary items to the master database (since this is really where they should be; it is content). This was my first "that's funny" moment: if the dictionary cache only rebuilds from core, how could our setup ever have worked?
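To make the lookup flow easier to follow, here is a simplified model of it. This is not Sitecore's code: the class name, the two delegates standing in for dictionary.dat and the core database, and the fallback to the key for unknown phrases are all my own assumptions for the sake of the sketch.

```csharp
using System;
using System.Collections;

// Simplified, hypothetical model of the lookup flow; none of these names are Sitecore's.
public class DictionaryCacheModel
{
    // language name -> (key -> phrase), mirroring the nested Hashtable layout.
    private Hashtable languages;

    // Stand-in for reading dictionary.dat; returns null when the file is missing or unreadable.
    private readonly Func<Hashtable> tryLoadFile;

    // Stand-in for the rebuild, which only ever reads the core database.
    private readonly Func<Hashtable> rebuildFromCore;

    public DictionaryCacheModel(Func<Hashtable> tryLoadFile, Func<Hashtable> rebuildFromCore)
    {
        this.tryLoadFile = tryLoadFile;
        this.rebuildFromCore = rebuildFromCore;
    }

    public string Text(string language, string key)
    {
        if (languages == null)
        {
            // First request after a restart: prefer the persisted file, fall back to core only.
            languages = tryLoadFile() ?? rebuildFromCore();
        }

        var phrases = languages[language] as Hashtable;
        var phrase = phrases == null ? null : phrases[key] as string;

        // Assumption: an unknown key falls back to the key itself.
        return phrase ?? key;
    }
}
```

In this model, a phrase that only lives in master is found as long as the file delegate returns a table that contains it. Once the file is gone, the rebuild reads only the core table and the phrase silently disappears.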

Time for some more research. After some more tinkering, I found Sitecore.Globalization.ItemSaveEventHandler. This is an event handler that is hooked up to the ItemSaving event in web.config. Whenever an item is saved, if it is of type "dictionary item", the handler adds the key and phrase to the internal language hashtable, which also triggers a save of the dictionary.dat file. This event handler, however, does not care which database is being used. Saves to master and core (and web, for that matter) all trigger an update of the cache.

This explained everything. If Sitecore is only ever used on a single machine, dictionary items in the master database will work. If the site is deployed to production, it will work too, because we typically copy the entire website folder, including the temp files, and thus the "correct" dictionary.dat. However, if Sitecore is used in parallel on different servers, you will start seeing the errors we were seeing: inconsistent and/or missing dictionary entries. This could be an issue during development if you set up the environment on different developer workstations, but also in staging environments or in load-balanced environments (depending on your setup). If dictionary items (again, in the master database) are saved in parallel on different environments, each environment generates its own dictionary.dat file with different phrases, and these files are impossible to merge. And if you lose dictionary.dat, you can't restore the master dictionary entries without saving each item again.
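The divergence can be reproduced with a toy model. Everything here is hypothetical (the WorkstationModel class, its members, and the single-language cache); it only mimics the two behaviours described above: a save handler that ignores the database, and a rebuild that reads core only.

```csharp
using System.Collections;

// Hypothetical model: each workstation has its own cache, and thus its own dictionary.dat.
public class WorkstationModel
{
    // key -> phrase; a single language keeps the example short.
    public readonly Hashtable Cache = new Hashtable();

    // Mimics the save handler: the database name is accepted but never checked.
    public void OnItemSaved(string database, string key, string phrase)
    {
        Cache[key] = phrase;
    }

    // Mimics a rebuild after dictionary.dat is lost: only core items come back.
    public void RebuildFromCore(Hashtable coreItems)
    {
        Cache.Clear();
        foreach (DictionaryEntry entry in coreItems)
        {
            Cache[entry.Key] = entry.Value;
        }
    }
}
```

If workstation A saves "Read more" to master and workstation B saves "Next page" to master, each cache contains only the item saved locally, even though both items sit in the same shared master database. After a rebuild, neither item is left in either cache.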

Revisiting the SDN documentation, though it is quite thin on the subject, it does state that you should add your own dictionary items to the core database. It feels wrong to do this, since our own dictionary items would then be mixed with Sitecore's own internal ones, and because I don't think the core database is a place to store customer data. So I guess it is not a bug that storing dictionary items in master is not supported, but it would be a nice and reasonable feature. I do think there is a bug here, though, in that it is at all possible to use dictionary items stored in master when this is not supported. It should definitely work consistently, and not be some half-baked feature that works "occasionally, if you use the right setup". It should also be noted that the Dictionary node is already in the master database in a clean installation, so there are no alarm bells going off when the novice Sitecore programmer starts using it.

Now, we already have solutions in production using this approach, and time committed on ongoing projects for using it this way. So we needed a way to ensure that dictionary items in the master database would behave consistently and just work.

The solution is below. It is basically a class that can rebuild the Sitecore dictionary from both databases. It reuses the Sitecore logic by invoking the private Load and Save methods using reflection, and by overwriting the static _languages hashtable field in the Translate class. This is really not pretty. It is an ugly hack, and I would definitely prefer not to use reflection to call into methods that were never intended to be used from outside the class. That being said, it seems to work, but of course there are no guarantees, and if it blows up or kills your kitten, I'm not responsible.

To use it, simply call the Rebuild method. I used a custom IHttpHandler for the purpose, so I can call the URL whenever needed (don't deploy the handler into production, however ;-) ). After the cache has been rebuilt, you can share the dictionary.dat with other development machines just by copying it, or each developer can rebuild when needed, at their own discretion.

    using System;
    using System.Reflection;
    using Sitecore.Data;
    using Sitecore.Configuration;
    using System.Collections;
    using Sitecore.Data.Items;
    using Sitecore.SecurityModel;

    namespace Webdanmark.SitecoreCMS.Common.Utility
    {
        /// <summary>
        /// This class supports rebuilding the Sitecore dictionary from both Core and Master databases.
        /// Default implementation from Sitecore can only rebuild from Core, which leads to various issues if
        /// the temp dictionary.dat file is lost, or editing happens on multiple servers.
        /// </summary>
        /// <remarks>
        /// This class tinkers with Sitecore private methods and internal workings. Not pretty.
        /// This is a hack to work around a limitation in Sitecore without re-implementing the whole thing.
        /// </remarks>
        public class DictionaryRebuilder
        {
            /// <summary>
            /// Event fired when progress in the task occurs.
            /// </summary>
            public event EventHandler<DictionaryRebuilderEventArgs> Progress;
            /// <summary>
            /// The databases to read dictionary items from.
            /// </summary>
            private readonly Database[] databases;
            /// <summary>
            /// The Translate type.
            /// </summary>
            private readonly Type translateType;
            /// <summary>
            /// Delegate for the private Translate.Load method.
            /// </summary>
            private readonly Action<Hashtable, Item> loadMethod;
            /// <summary>
            /// Binding flags for a private static member.
            /// </summary>
            private static readonly BindingFlags privateStatic = BindingFlags.Static | BindingFlags.NonPublic;
            /// <summary>
            /// Delegate for the private Translate.Save method.
            /// </summary>
            private readonly Action saveMethod;

            /// <summary>
            /// Initializes a new instance of the <see cref="DictionaryRebuilder"/> class.
            /// </summary>
            public DictionaryRebuilder()
            {
                databases = new[] { Factory.GetDatabase("core"), Factory.GetDatabase("master") };
                translateType = typeof(Sitecore.Globalization.Translate);
                loadMethod = (Action<Hashtable, Item>) FindMethod<Action<Hashtable, Item>>("Load", privateStatic, typeof(Hashtable), typeof(Item));
                saveMethod = (Action) FindMethod<Action>("Save", privateStatic);
            }

            /// <summary>
            /// Rebuilds the dictionary cache.
            /// </summary>
            public void Rebuild()
            {
                // Root table: language name -> table of key/phrase pairs.
                Hashtable rootTable = new Hashtable(10);
                foreach (var db in databases)
                {
                    var langs = db.Languages;
                    SendMessage("\nProcessing {0} database, {1} languages.", db.Name, langs.Length);
                    foreach (var language in langs)
                    {
                        // Reuse the language table if an earlier database already created it,
                        // so phrases from core and master end up in the same table.
                        string languageKey = language.ToString();
                        Hashtable languageTable;
                        if (rootTable.ContainsKey(languageKey))
                            languageTable = (Hashtable)rootTable[languageKey];
                        else
                            rootTable[languageKey] = languageTable = new Hashtable();

                        RebuildLanguage(db, language, languageTable);
                    }
                }
                SendMessage("\nLanguages loaded.");
                ReplaceSitecoreTable(rootTable);
                SendMessage("Writing data cache to file.");
                saveMethod();
                SendMessage("\nDone.");
            }

            /// <summary>
            /// Finds a private method on the Translate type and wraps it in a delegate.
            /// </summary>
            /// <typeparam name="TDelegate">The type of the delegate.</typeparam>
            /// <param name="name">The method name.</param>
            /// <param name="bindingFlags">The binding flags.</param>
            /// <param name="parameterTypes">The parameter types.</param>
            /// <returns>A delegate bound to the matching static method.</returns>
            private Delegate FindMethod<TDelegate>(string name, BindingFlags bindingFlags, params Type[] parameterTypes)
            {
                MethodInfo method = translateType.GetMethod(name, bindingFlags, Type.DefaultBinder, parameterTypes, null);
                // Fail with a clear message if a later Sitecore version renames or removes the method.
                if (method == null)
                    throw new InvalidOperationException("Translate." + name + " was not found. The Sitecore internals may have changed.");
                return Delegate.CreateDelegate(typeof(TDelegate), method);
            }

            /// <summary>
            /// Replaces the internal language table in the Translate class.
            /// </summary>
            /// <param name="hashtable">The hashtable.</param>
            private void ReplaceSitecoreTable(Hashtable hashtable)
            {
                FieldInfo fi = translateType.GetField("_languages", privateStatic);
                // Same guard as in FindMethod: the field name is an implementation detail.
                if (fi == null)
                    throw new InvalidOperationException("Translate._languages was not found. The Sitecore internals may have changed.");
                fi.SetValue(null, hashtable);
            }

            /// <summary>
            /// Loads the dictionary items of one database and language into the language table.
            /// </summary>
            /// <param name="db">The db.</param>
            /// <param name="language">The language.</param>
            /// <param name="languageTable">The language table.</param>
            private void RebuildLanguage(Database db, Sitecore.Globalization.Language language, Hashtable languageTable)
            {
                using (new SecurityDisabler())
                {
                    var dictionaryRoot = db.GetItem("/sitecore/system/dictionary", language);
                    if (dictionaryRoot == null)
                    {
                        SendMessage("\tNo dictionary found in {0} for {1}", db.Name, language.Name);
                        return;
                    }

                    SendMessage("\tProcessing {0}", language.Name);
                    loadMethod(languageTable, dictionaryRoot);
                }
            }

            /// <summary>
            /// Sends a progress message to any subscribers.
            /// </summary>
            /// <param name="msg">The message format string.</param>
            /// <param name="inserts">The format arguments.</param>
            private void SendMessage(string msg, params object[] inserts)
            {
                // Copy to a local so a handler unsubscribing concurrently cannot cause a null call.
                var handler = Progress;
                if (handler != null)
                {
                    var args = new DictionaryRebuilderEventArgs { Message = String.Format(msg, inserts) };
                    handler(this, args);
                }
            }

            /// <summary>
            /// Event arguments
            /// </summary>
            public class DictionaryRebuilderEventArgs : EventArgs
            {
                /// <summary>
                /// Gets or sets the message.
                /// </summary>
                /// <value>The message.</value>
                public string Message { get; set; }
            }
        }
    }

And you could use it like this:

    DictionaryRebuilder builder = new DictionaryRebuilder();
    builder.Progress += (s, e) => Response.WriteLine(e.Message);
    builder.Rebuild();

If you want to display progress while rebuilding, hook up an event handler to the Progress event. The one in my example won't compile for you, since Response.WriteLine is an extension method in one of our common libraries.
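For completeness, here is a rough sketch of what a handler along those lines could look like using only stock ASP.NET calls. The handler class name is made up, and how you register it (web.config or an .ashx file) is up to you. It assumes the DictionaryRebuilder class above is compiled into the same site.

```csharp
using System.Web;
using Webdanmark.SitecoreCMS.Common.Utility;

// Hypothetical handler name; register it however you prefer for your development site.
public class DictionaryRebuildHandler : IHttpHandler
{
    // A new rebuilder is created per request, so there is nothing to gain from reuse.
    public bool IsReusable
    {
        get { return false; }
    }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/plain";

        var builder = new DictionaryRebuilder();
        // Response.Write is the standard ASP.NET method, so no extension method is needed.
        builder.Progress += (s, e) => context.Response.Write(e.Message + "\n");
        builder.Rebuild();
    }
}
```

As mentioned earlier, this is a development tool. Anyone who can reach the URL can trigger a rebuild, so keep it out of production.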

Disclaimer: The contents of this blog post are my own opinions only. I am not affiliated with Sitecore in any way. Some of the technical details were extracted using Reflector, and some are educated guesswork. I might be wrong, and Sitecore might very well change the implementation in a later version, so that the information above no longer applies. This was done on Sitecore 6.0.1 rev 090212, but I suspect that the general idea is the same in previous Sitecore versions.