
Taxonomy

When taxonomy was first introduced in SharePoint 2010, I felt as though around three weeks before the product was released in a technical preview, some program manager at Microsoft decided to sneak it in. I mean, it works fine, but the implementation seems a bit kludgy; it contains hidden lists and note fields, hooks curious event receivers, and presents configuration difficulties.

For example, you need to grant permissions in three different places in the Managed Metadata service application to start working with groups and terms programmatically and via the UI. The following picture journey was all captured while logged in as the local/farm/Jesus admin. First, highlight the "Managed Metadata Service" row in the service app list, and click "Administrators" in the ribbon. The following UI pops up:

Viewing the Managed Metadata Service administrators

Add your administrators at the top, give them "Full Control" at the bottom, and then click "OK." Next, follow the same steps in the service app list and the ensuing pop up, but this time click the "Permissions" button in the ribbon. Make sure to grant all permissions. If you have application logic that hits the term store, you'll need to elevate and impersonate whichever account you use here.

Viewing the Managed Metadata Service permissions

Finally, you'll need to give permissions (to yourself and others) in the service app itself. Click on the text "Managed Metadata Service" and you're brought to the Term Store Management Tool. Notice that all the options are grayed out and you can't perform any operations on the "Taxonomy Term Store" tree on the left. You need to make yourself a "Term Store Administrator" here as well. Populate the people picker and click "Save" at the bottom.

Setting the term store administrators

It'll spin...

Watching the taxonomy grass grow

And now we can play!

Completing the taxonomy setup

Ugh. I needed to make my farm admin (the only account on this box) an "administrator" in two different places in the same service application? I repeat: ugh. However, this is of course assuming that the service app is up and running. Periodically, especially when screwing around on the farm (such as adding or removing front ends), the service may poop out:

Breaking the Managed Metadata service

To fix this, return to the Central Administration home page, and under "System Settings" click on "Manage services on server." Scroll down this page, and see if the "Managed Metadata Web Service" has a "Status" of "Stopped" in glaring red text. All you have to do on this page is click "Start" to fire the service back up.

Fixing the Managed Metadata service

Finally, do an IISRESET and everything should come back to life. If it doesn't, put down this book, and start Binging. But even more annoying than this configuration stuff, to me, is that taxonomy doesn't seem to play well with its environment. For example: its first version was curiously absent from the client object model APIs, and incompatible with the datasheet view in lists.

These are the sorts of issues I see resulting from UAT phases of my projects when my developers try to shove in a last minute enhancement. SharePoint 2013 supports the two above features, so that leads me to believe that taxonomy has gotten some much needed love. Beyond that, "day to day" operations and code in the taxonomy API are pretty much what you're used to in 2010.

At first, I wasn't going to include taxonomy in this book at all, since the Managed Metadata Service has an Excel-driven import utility. Why would anyone pay for code when they can use Excel themselves? Well, the issue with this is twofold. First, there's (still) no export utility from which to generate the spreadsheet, so administrators would have to roll that themselves.

But more important (especially from a deployment perspective) is the fact that this import utility doesn't consider the guids of the groups, term sets, and terms. So when you use the aforementioned utility, SharePoint generates fresh guids on every farm, and your taxonomy objects will not be "compatible" across your development, staging, and production servers.

This leads to grayed out, broken taxonomy fields after restoring from a site collection backup. By invoking one of the core tenets of All Code deployments, storing guids in the Constants class and provisioning assets with this known unique identifier, we can side-step this issue by baking taxonomy into our deployment scripts.

Dealing with broken taxonomy fields

When I first played with the Term Store Management Tool in 2013, I was encouraged by seeing a "Unique Identifier" section at the bottom of the screen on the right that displayed the guid for each object in the tree. (See the screenshots at the beginning of this section.) Maybe SharePoint now takes better care to maintain these ids across environments? But looking at the "Sample Import" spreadsheet (click the "View a sample import file" link with the root node in the tree selected), there was no column for guid, so I'm assuming this id maintenance is still on us:

Exporting the term store

Building The Terms

No worries; we'll sling some code to create and deploy our taxonomy for us. First, add "Microsoft.SharePoint.Taxonomy.dll" (in 15\ISAPI) to DDD.Dependencies, and then reference it in DDD.Common. Next, we add our taxonomy extension methods to Utilities.cs. I'll show you a few of the methods here; the full source will be available in this book's accompanying code.

Code Listing 80: Utilities.cs

  1. using Tax = Microsoft.SharePoint.Taxonomy;
  2. ...
  3. public static Tax.TermSet EnsureTermSet(this Tax.TaxonomySession session, Tax.Group group, Guid id, string name, Dictionary<Guid, string> terms)
  4. {
  5.     //initialization
  6.     bool isDirty = false;
  7.     List<string> sortOrder = new List<string>();
  8.     //get termset
  9.     Tax.TermSet termSet = group.TermSets.SingleOrDefault(t => t.Id.Equals(id));
  10.     if (termSet == null)
  11.     {
  12.         //create termset
  13.         termSet = group.CreateTermSet(name, id, 1033);
  14.         isDirty = true;
  15.     }
  16.     //add terms
  17.     if (terms != null)
  18.     {
  19.         //create each term
  20.         foreach (Guid key in terms.Keys)
  21.         {
  22.             //make sure term doesn't exist
  23.             if (!termSet.Terms.Any(t => t.Id.Equals(key)))
  24.             {
  25.                 //create term
  26.                 termSet.CreateTerm(terms[key], 1033, key);
  27.                 sortOrder.Add(key.ToString());
  28.                 isDirty = true;
  29.             }
  30.         }
  31.     }
  32.     //set custom order
  33.     if (string.IsNullOrEmpty(termSet.CustomSortOrder))
  34.         termSet.CustomSortOrder = string.Join(":", sortOrder.ToArray());
  35.     //save
  36.     if (isDirty)
  37.         session.DefaultKeywordsTermStore.CommitAll();
  38.     //return
  39.     return termSet;
  40. }

There are a couple of things to point out here. First, in Line #1, I alias the Microsoft.SharePoint.Taxonomy namespace because it doesn't use the "SP" prefix before all its classes and collides with other types (and has a "FeatureIds" member just like the publishing DLL). The method signature in Line #3 extends a "TaxonomySession" instance, which is our programmatic access point to the Managed Metadata service.

The TaxonomySession has a "DefaultKeywordsTermStore" object hanging off of it. This is our term store object; think of this as the root node in the left-hand tree view in the Term Store Management Tool. The next level down in the object hierarchy that we can start playing with is the "Group," which is just a logical, well, grouping of Term Sets. A Term Set is itself a collection of terms, and is the thing that maps to a taxonomy field back over in the site collection.

Finally, at the leaf level is the term, which I elect to represent as a <guid, string> element in a generic dictionary. Terms can have sub terms and sub sub terms and so on. They do have more metadata than just a name and an id, but these are really the only ones I find useful. I leave filling in the descriptions as a task for the administrators.

I chose to display EnsureTermSet because it operates on a lot of these objects. Another thing to point out here is my usage of a "dirty" bit in several places in this method. This is just a best practice I picked up: taxonomy operations can be very slow, so saving (the "CommitAll" in Line #37) only when something has actually changed can make your code run much more quickly.

Finally, and most importantly, notice that I'm using the All Code "Create if not already there" pattern, which differs from Structure's "create or bomb" paradigm. This is because we can't blow away our term store quite as easily as a site collection because it's more "global" in nature and might disrupt other web apps in your development environment. Now, of course, there's no reason why you couldn't; I just feel like this pattern is easier in this scenario.
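The dirty bit and the "Create if not already there" pattern are easy to see outside of SharePoint. What follows is only a minimal sketch under stated assumptions, not the book's code: FakeTermSet and EnsureTerms are hypothetical stand-ins invented for illustration, and Commit plays the role of the expensive CommitAll call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a term set; NOT part of the SharePoint API.
public class FakeTermSet
{
    public Dictionary<Guid, string> Terms = new Dictionary<Guid, string>();

    // Counts how many "expensive" saves were performed.
    public int CommitCount;

    // Stands in for the slow CommitAll round trip.
    public void Commit() { CommitCount++; }
}

public static class EnsureSketch
{
    // Adds only the missing terms, then commits once, and only if something changed.
    // Returns the number of terms that were created.
    public static int EnsureTerms(FakeTermSet termSet, Dictionary<Guid, string> terms)
    {
        bool isDirty = false;
        int created = 0;

        foreach (KeyValuePair<Guid, string> pair in terms)
        {
            // create if not already there
            if (!termSet.Terms.ContainsKey(pair.Key))
            {
                termSet.Terms.Add(pair.Key, pair.Value);
                created++;
                isDirty = true;
            }
        }

        // skip the save entirely when nothing was touched
        if (isDirty)
            termSet.Commit();

        return created;
    }
}
```

Calling EnsureTerms twice with the same dictionary creates everything and commits once on the first pass, then creates nothing and skips the commit on the second. That is the property that makes it safe to re-run provisioning code against a term store that already holds some of the terms.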

Here's a quick listing of the rest of the taxonomy utility methods:

  • EnsureGroup
  • CreateSubTerms
  • CreateTaxonomySiteColumn
  • FixTaxonomySiteColumn
  • RemoveTaxonomySiteColumn

Deploying The Terms

Now let's add the code to our Structure that creates a group, a term set, and some terms. We'll then create a new taxonomy column bound to the term set and add it to our content type. We'll be using CreateTaxonomySiteColumn above for this. This method is very similar to CreateLookupSiteColumn (since taxonomy fields actually inherit from SPFieldLookup) so I won't show it. We'll see the interesting parts of that method anyway when we look at FixTaxonomySiteColumn. But first, our constants:

Code Listing 81: Constants.cs

  1. public static class SiteColumns
  2. {
  3.     ...
  4.     public class TaxField
  5.     {
  6.         public const string DisplayName = "Taxonomy Field";
  7.         public const string InternalName = "TaxonomyField";
  8.         public static readonly Guid Id = new Guid("4AF22FB8-46BB-4FC0-8757-12EB82F63F88");
  9.         public static readonly Guid NoteId = new Guid("EB33F5F9-D8D4-49D8-BF95-B4E4306DBF26");
  10.     }
  11. }
  12. ...
  13. public static class Taxonomy
  14. {
  15.     public const string GroupName = "DDD Group";
  16.     public const string TermSetName = "DDD Term Set";
  17.     public static readonly Guid GroupId = new Guid("B7005724-F96B-45B6-93FA-BA18714DF59B");
  18.     public static readonly Guid TermSetId = new Guid("7FBFA134-CB8D-472F-80E6-F7FFF5F518BE");
  19.     public class ParentTerm
  20.     {
  21.         public const string Name = "Parent Term";
  22.         public static readonly Guid Id = new Guid("5FE02E68-F170-4812-ADB8-F73FC737AB74");
  23.         public static Dictionary<Guid, string> ChildTerms = new Dictionary<Guid, string>()
  24.         {
  25.             { new Guid("21B121DD-0FEA-46A0-88FF-612632E715FA"), "Child Term" },
  26.             { new Guid("C591DFFB-A9E7-4DF7-90D3-BA46D4F95743"), "Another Child Term" }
  27.         };
  28.     }
  29.     public class AnotherTerm
  30.     {
  31.         public const string Name = "Another Term";
  32.         public static readonly Guid Id = new Guid("3F143828-FB31-4649-BF80-59B244B0DC54");
  33.     }
  34. }

First of all, in Line #4, don't forget to add our bits for the new taxonomy field. The only deviation from the standard site column pattern here is Line #9, where we include a constant guid for our note field. SharePoint creates this field for us, but I like to roll my own so that we have its id in Constants; this makes the fix and remove logic easier. Otherwise, we'd have to burn an extra API call to get the note field from the taxonomy field itself.

Next, in the Taxonomy class, we see a generic dictionary on Lines #23-27. This is a shortcut that avoids creating a separate class for every single damn term, since, like I said, all I really care about is each one's id and name. Also, CreateSubTerms takes this type in as a parameter to create the corresponding terms that don't already exist for a parent term. However, you can do whatever you'd like here. I've experimented with using interfaces to clean some of this up, but it makes the "Go to definition" command (which I use quite a bit) in Visual Studio essentially useless.

After adding a reference to Microsoft.SharePoint.Taxonomy.DLL to DDD.Web, we are going to add the following code to the beginning of the Structure feature receiver. Why at the beginning? What I've found is that if we mess with taxonomy after calling SPWeb.Update too many times (which we do plenty of in All Code) you get the following exception:

Getting a taxonomy exception

Now I hate the idea of code working in some areas of a file and not others (all logic and variable scoping considered). Probably a better way to do this is to reconstitute an SPWeb so that we're operating against a fresh instance of it. Another area where this issue crops up is when calling SPContentType.Update too many times as well.

Basically any time we're shoving an SPPersistedObject into the database too often, our code will start to throw concurrency errors like the one above. We'll see an example of this "Update an SPPersistedObject from a fresh web" pattern at the end of this section. In general, I use this in second or third versions of Structure code when we need to update existing objects in "Create if not already there" scenarios.

If you want, you can wrap metadata provisioning into its own feature. However, the "flow" of Structure code is to use the objects returned from previous extension methods to seed the next object's provisioning further down the method. In the code below, we're creating a new taxonomy field bound to a term set and adding it to our Rollup Article content type. Keeping such logic in one place makes dependencies easier. However, if you are creating a massive hierarchy of terms and want to keep it separate, go right ahead. Don't kludge your architecture to save five minutes of development time!

Code Listing 82: Structure.EventReceiver.cs

  1. using Tax = Microsoft.SharePoint.Taxonomy;
  2. ...
  3. public override void FeatureActivated(SPFeatureReceiverProperties properties)
  4. {
  5.     //get site and web
  6.     SPSite site = ((SPSite)properties.Feature.Parent);
  7.     SPWeb root = site.RootWeb;
  8.     //get taxonomy
  9.     Tax.TaxonomySession session = new Tax.TaxonomySession(site);
  10.     //ensure group
  11.     Tax.Group group = session.EnsureGroup(
  12.         Constants.Taxonomy.GroupId,
  13.         Constants.Taxonomy.GroupName);
  14.     //ensure term set
  15.     Tax.TermSet ts = session.EnsureTermSet(
  16.         group,
  17.         Constants.Taxonomy.TermSetId,
  18.         Constants.Taxonomy.TermSetName,
  19.         new Dictionary<Guid, string>()
  20.         {
  21.             { Constants.Taxonomy.ParentTerm.Id, Constants.Taxonomy.ParentTerm.Name },
  22.             { Constants.Taxonomy.AnotherTerm.Id, Constants.Taxonomy.AnotherTerm.Name }
  23.         });
  24.     //build child terms
  25.     Tax.Term term = ts.GetTerm(Constants.Taxonomy.ParentTerm.Id);
  26.     session.CreateSubTerms(term, Constants.Taxonomy.ParentTerm.ChildTerms);
  27.     ...
  28.     //create taxonomy column
  29.     SPList taxList = root.Lists.TryGetList(Constants.Lists.Taxonomy);
  30.     SPField taxField = root.CreateTaxonomySiteColumn(
  31.         Constants.SiteColumns.TaxField.Id,
  32.         Constants.SiteColumns.TaxField.NoteId,
  33.         Constants.SiteColumns.TaxField.DisplayName,
  34.         Constants.SiteColumns.TaxField.InternalName,
  35.         Constants.SiteColumns.GroupName,
  36.         ts,
  37.         taxList.ID,
  38.         false,
  39.         true);
  40.     //create content types
  41.     SPContentType rollupArticle = root.CreateContentType(
  42.         Constants.ContentTypes.RollupArticle.Id,
  43.         Constants.ContentTypes.RollupArticle.Name,
  44.         Constants.ContentTypes.GroupName,
  45.         ContentTypeId.Page,
  46.         Constants.SiteColumns.MainContent.Id,
  47.         Constants.SiteColumns.Abstract.Id,
  48.         Constants.SiteColumns.ThumbnailImage.Id,
  49.         Constants.SiteColumns.ExternalLink.Id,
  50.         Constants.SiteColumns.RollupDate.Id,
  51.         Constants.SiteColumns.TaxField.Id);
  52.     ...
  53. }

Starting at Line #8, follow along as we get a session, create a group (Line #11), provision a term set with two terms, "Parent Term" and "Another Term" (Line #15), and then add two child terms to "Parent Term" (Line #26). Line #25 is a straight API call. Now that we have a reference to the term set (and the hidden taxonomy list from Line #29), we can create a field mapped to it.

This happens in Line #30. As I said before, a taxonomy field is a lookup field; you just have to tell the method which term set to use. Finally, Line #51 is the new bit: it's the argument added to the CreateContentType call on Line #41. When making Structure updates, you can add fields to content types this way, or, following the "Create if not already there" pattern, use AddColumns to add net new fields to the content type this method extends. (We'll see this method in detail a bit later.)

And there we have it! Republish DDD.Web (which sucks in and GACs DDD.Common as well) and kick off DoEverythinger.ps1. When that's done, you'll be able to see our taxonomy field on the Rollup Article content type. Making Managed Metadata your bitch isn't that hard! The only thing to add is, as I've mentioned before, if you find your taxonomy constantly growing and changing during development, consider putting all this logic into its own feature, or possibly its own WSP so you can push these changes without disrupting your Structure code.

Fixing The Terms

Finally, let's look at our taxonomy fixer logic. This next bit has to be in a separate feature, since we only want to activate it after we restore our site collection from a backup when it contains taxonomy fields mapped to term sets that were created manually. I actually run into this a lot, since project plans typically have their term hierarchy built concurrently with the initial development cycle of a new intranet.

The general process in this situation is to have administrators and content authors create terms and pages live in production. Next, this is backed up and restored to the staging farm. Finally, the site collection is pushed down to your development/local environment. There you can regen any SPMetal, fix the taxonomy, and run the new Structure code against it. Finally, we deploy the new WSPs back up the chain. But first, let's look at FixTaxonomySiteColumn.

Code Listing 83: Utilities.cs

  1. using Tax = Microsoft.SharePoint.Taxonomy;
  2. ...
  3. public static bool FixTaxonomySiteColumn(this Tax.TaxonomySession session, SPWeb web, Guid fieldId, Guid noteFieldId, Guid termSetId, Guid groupId)
  4. {
  5.     try
  6.     {
  7.         //get store
  8.         if (session.DefaultKeywordsTermStore == null)
  9.             throw new Exception("The DefaultKeywordsTermStore was not found.");
  10.         //get group
  11.         Tax.Group group = session.EnsureGroup(groupId, string.Empty);
  12.         if (group == null)
  13.             throw new Exception(string.Format("The {0} group was not found.", groupId));
  14.         //get termset
  15.         Tax.TermSet termSet = session.EnsureTermSet(group, termSetId, string.Empty, null);
  16.         if (termSet == null)
  17.             throw new Exception(string.Format("The {0} term set was not found.", termSetId));
  18.         //get taxonomy list
  19.         SPList list = web.Lists.TryGetList(Constants.Lists.Taxonomy);
  20.         if (list == null)
  21.             throw new Exception(string.Format("The {0} list was not found.", Constants.Lists.Taxonomy));
  22.         //get taxonomy field
  23.         Tax.TaxonomyField field = web.TryGetField(fieldId) as Tax.TaxonomyField;
  24.         if (field == null)
  25.             throw new Exception(string.Format("The {0} field was not found.", fieldId));
  26.         //remap to term set
  27.         field.AnchorId = Guid.Empty;
  28.         field.TermSetId = termSet.Id;
  29.         field.TextField = noteFieldId;
  30.         field.SspId = termSet.TermStore.Id;
  31.         field.TargetTemplate = string.Empty;
  32.         //remap to hidden taxonomy list
  33.         field.SchemaXml = field.SchemaXml.Replace(field.LookupList, list.ID.ToString());
  34.         field.SchemaXml = field.SchemaXml.Replace(field.LookupWebId.ToString(), web.ID.ToString());
  35.         //save
  36.         field.Update(true);
  37.         return true;
  38.     }
  39.     catch (Exception ex)
  40.     {
  41.         //TODO: log error
  42.         //unsuccessful
  43.         return false;
  44.     }
  45. }

Once again, in Line #1, I always alias Microsoft.SharePoint.Taxonomy to avoid namespace collisions. Line #8 is a check to make sure the default term store is available. In the next few blocks, notice that this method is very picky when it comes to null checking the taxonomy API, from the DefaultKeywordsTermStore to the term set and even the hidden taxonomy list.

This is due to the nature of the scenario when this method is relevant: broken taxonomy fields. Since we can't be positive of the cause of the issue (and as we've seen, there are many things that can go wrong with the Managed Metadata service application), we need to be very careful not to have one little exception take down the entire feature upon activation. To this end, we basically return true if the code runs to completion, and false otherwise.

After retrieving the field in Line #23, the rest of the method goes through the exercise of remapping it to the desired term set and re-hooking it to the proper taxonomy list. Lines #33 and #34 are interesting: we literally replace the old guids of the list and web id (for the taxonomy list) in the field's schema XML with the new ones. This is another potential breaking point for site collection restorations: taxonomy lists with different guids.
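One thing to watch with a raw string swap like this: String.Replace does an ordinal, case-sensitive match, while Guid.ToString() always produces lowercase hex. If a guid happens to be stored in the schema XML in uppercase, the replace quietly does nothing. The sketch below is one hedged way to guard against that. ReplaceGuid is a hypothetical helper, not part of the book's Utilities or the SharePoint API, and it assumes the guid appears in the XML in the standard dashed form.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SchemaRemapSketch
{
    // Hypothetical helper: swaps every occurrence of oldId for newId in the XML,
    // ignoring letter case. Any braces around the old guid are left in place.
    public static string ReplaceGuid(string schemaXml, Guid oldId, Guid newId)
    {
        // "D" is the default dashed format: 8-4-4-4-12 hex digits, no braces
        string pattern = Regex.Escape(oldId.ToString("D"));
        return Regex.Replace(schemaXml, pattern, newId.ToString("D"), RegexOptions.IgnoreCase);
    }
}
```

Given a made-up fragment such as `<Field WebId="AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA" />`, a plain Replace with the lowercase form of that guid leaves the string untouched, whereas ReplaceGuid swaps it. Whether your restored schema actually contains uppercase guids is something to check on your own farm before reaching for this.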

Next, we need to invoke this bad boy. Like I said, this needs to be separated from Structure, or, if you're refactoring all taxonomy into its own WSP, it clearly will be placed there. So create a new feature called "DDD TaxonomyFixer" in DDD.Web (or DDD.Taxonomy or whatever), scope it to a site collection, and add its guid to Constants. Then turn off the auto activation stuff in the feature's XML file, configure the pretty name and description, create a feature receiver, and implement it as follows:

Code Listing 84: TaxonomyFixer.EventReceiver.cs

  1. using Tax = Microsoft.SharePoint.Taxonomy;
  2. ...
  3. public override void FeatureActivated(SPFeatureReceiverProperties properties)
  4. {
  5.     //get site and web
  6.     SPSite site = ((SPSite)properties.Feature.Parent);
  7.     SPWeb root = site.RootWeb;
  8.     //get taxonomy session
  9.     Tax.TaxonomySession session = new Tax.TaxonomySession(site);
  10.     //open fresh web for each content type
  11.     using (SPWeb freshWeb = site.OpenWeb(root.ID))
  12.     {
  13.         //remove column from content type
  14.         SPContentType ct = freshWeb.TryGetStandardContentType(ContentTypeId.Page.GetContentTypeId(Constants.ContentTypes.RollupArticle.Id));
  15.         ct.RemoveColumn(Constants.SiteColumns.TaxField.Id, Constants.SiteColumns.TaxField.NoteId);
  16.         //fix column
  17.         session.FixTaxonomySiteColumn(
  18.             root,
  19.             Constants.SiteColumns.TaxField.Id,
  20.             Constants.SiteColumns.TaxField.NoteId,
  21.             Constants.Taxonomy.TermSetId,
  22.             Constants.Taxonomy.GroupId);
  23.         //add back to content type
  24.         ct.AddColumns(freshWeb.TryGetField(Constants.SiteColumns.TaxField.Id));
  25.     }
  26. }

This code uses the "Update an SPPersistedObject from a fresh web" pattern we discussed earlier in this section. After opening a fresh web in Line #11, we then grab the content type that holds our taxonomy field, remove the field, fix the field, and finally re-add the field. This is everything needed to fully rewire a taxonomy column. And don't worry: it doesn't wipe out the underlying SPListItem's field's value!

In Line #17, I don't use the returned Boolean from FixTaxonomySiteColumn in this example. It's more apropos when you're updating a bunch of columns and want to implement some sort of transaction, or if you're interested in logging your Structure's operations. It's really just a courtesy return; we still want to execute Line #24 regardless, otherwise we'll permanently have the column missing from the content type.

Take a look at Line #14, which calls GetContentTypeId. This is an extension of SPContentTypeId in Utilities that starts with the parent content type id and welds our Constants guid onto it. Since we can't store SPContentTypeIds in Constants (we technically could; there's just no way to get them at design time) I came up with this little ditty to help us out. We can chain several calls to GetContentTypeId together to get any content type id we need, as long as we know one parent somewhere in the hierarchy. This way, we only have to store a single guid that we can generate at design time.

Code Listing 85: Utilities.cs

  1. public static SPContentTypeId GetContentTypeId(this SPContentTypeId parentId, Guid id)
  2. {
  3.     //return
  4.     return new SPContentTypeId(string.Concat(parentId, "00", id.ToString().Replace("-", string.Empty)));
  5. }
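To make the "welding" concrete without a SharePoint reference, here is a minimal string-only sketch of the same composition. It is an illustration under stated assumptions, not the book's utility: ContentTypeIdSketch is a hypothetical class, it works on plain strings instead of SPContentTypeId, and the guids used with it below are placeholders.

```csharp
using System;

public static class ContentTypeIdSketch
{
    // String stand-in for the extension method above:
    // child id = parent id + "00" + the guid's 32 hex digits.
    public static string GetContentTypeId(string parentId, Guid id)
    {
        // "N" formats the guid as 32 hex digits with no dashes,
        // the same result as ToString().Replace("-", string.Empty)
        return string.Concat(parentId, "00", id.ToString("N"));
    }
}
```

Starting from "0x0101" (the built-in Document content type) and the placeholder guid 11111111-2222-3333-4444-555555555555, the result is "0x0101", then "00", then "11111111222233334444555555555555". Feeding that result back in with a second guid gives the id of a grandchild, which is the chaining described above.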

Advanced versions of this event receiver can load all content types and iterate through and fix all taxonomy fields in them via fresh webs. Also, notice that we're not wiring this feature activation into any of our deployment scripts. Like I said, this feature was designed to be activated, deactivated, and re-activated manually after a site collection restoration if needed; there's no reason to perform this sort of plastic surgery on our content types if their taxonomy fields aren't broken!

That does it for taxonomy deployment in All Code. I was probably a bit harsh on Managed Metadata; it really does its job well when you're nice to it. Knowing where its shortcomings are and how to deploy around them makes all the difference. It really did fill a major hole in SharePoint when it was introduced back in 2010. In fact, it's hard for me to envision a data/search-driven intranet without it!
