11/16/2011

SharePoint Designer / Visual Studio Deployments Part Two: Fixing Taxonomy Metadata (And Other) Issues After A Site Collection Restore

Background

For the past month or so, I've been architecting a whirlwind SharePoint 2010 project. My current team is made up of both SharePoint developers (working in Visual Studio) and SharePoint configurators (working in SharePoint Designer). This has made for a lot of interesting technical collaborations as well as debates over what should be implemented in code and what shouldn't. This post is part two of the tale; part one discussed the approach and technical details behind this collaboration and the nuances around deploying assets created in SharePoint Designer from Visual Studio WSPs.

Now that we've made it through the deployments, I'd like to use part two to share the pain points. I realize that discussing all the things that went wrong, in detail, seems pessimistic, but look at it this way: everything else worked perfectly! There are two main types of problems that resulted from this collaboration that I'd like to discuss: weirdness with SharePoint assets (page layouts, site columns, and content types) deployed as feature elements, and taxonomy issues.

Deployment Issues

Site Column Updates

The first issue we noticed was that the following love note would be displayed whenever a field was updated (either through the UI or programmatically):

Update Site Column

"The object has been updated by another user since it was last fetched."

Ugh. After poring over the XML that defined these fields, I noticed something interesting:

Code Listing 1

<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
<Field Type="TaxonomyFieldTypeMulti" DisplayName="Across the Movement" List="{e85017b8-e168-44e1-addf-623d97242b7a}" WebId="f45e7816-bb4f-4e19-b2da-a924bed6b658" ShowField="Term1033" Required="FALSE" EnforceUniqueValues="FALSE" Mult="TRUE" Sortable="FALSE" Group="_Exchange Columns" ID="{2f9067c0-22f9-4354-ae21-e4dc35cd39d7}" SourceID="{f45e7816-bb4f-4e19-b2da-a924bed6b658}" StaticName="AcrossMovement" Name="AcrossMovement" Version="3" />
</Elements>

Scroll all the way to the right of Line #2. See the "Version" attribute? That was the culprit. From what I can intuit, this site column was updated via the UI three separate times in production. Then, after being backed up (by way of the site collection) and restored to development, the field still thought it was on its third version, even though the content database only had one version (since backups don't capture versioning). I feel like this was confusing things; how would SharePoint load the third version of something if there's only a single one in the database? BOOM! I dropped this attribute, redeployed the WSP, and the issue went away.
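If several fields carry a stale Version attribute, the cleanup can be scripted rather than done by hand. The following is a minimal sketch, not the tooling used on this project: FieldVersionStripper and StripVersions are hypothetical names, and it assumes the manifest uses the standard SharePoint elements namespace shown in Code Listing 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the names are illustrative only.
public static class FieldVersionStripper
{
    // Namespace declared on the <Elements> root of a feature manifest.
    private static readonly XNamespace Sp = "http://schemas.microsoft.com/sharepoint/";

    // Removes the Version attribute from every <Field> element in the
    // document and returns the number of attributes removed.
    public static int StripVersions(XDocument elements)
    {
        if (elements == null) throw new ArgumentNullException("elements");

        // Materialize the matches first so the tree isn't modified mid-query.
        List<XAttribute> versions = elements
            .Descendants(Sp + "Field")
            .Select(f => f.Attribute("Version"))
            .Where(a => a != null)
            .ToList();

        foreach (XAttribute version in versions)
            version.Remove();

        return versions.Count;
    }
}
```

A manifest would be loaded with XDocument.Load, passed to StripVersions, and saved back before the WSP is packaged.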

Patch Up Code

After these assets were imported, all new content types and site columns were created via code, so that everything had the same ids in all environments. Whenever I had to reference an existing site column for a new content type, I had to create a constant holding its guid (taken from the proper Elements.xml file) so I could refer to the field by id. This way, there's no messing around with strings or column internal names. This type of SPD/VS integration code isn't bad at all.
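As a rough illustration of that pattern, a constant for the field in Code Listing 1 could look like the sketch below. The member name AcrossMovementId is a placeholder, and the nesting simply mirrors the Constants.SiteColumns references that appear in Code Listing 3; only the guid itself comes from the Elements.xml markup above. A field can then be fetched through the guid indexer on web.Fields instead of by name.

```csharp
using System;

// Illustrative shape only; the real class would hold one entry per imported column.
public static class Constants
{
    public static class SiteColumns
    {
        // ID attribute of the "Across the Movement" field in Code Listing 1.
        public static readonly Guid AcrossMovementId =
            new Guid("{2f9067c0-22f9-4354-ae21-e4dc35cd39d7}");
    }
}
```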

However, there were certain issues that required some more detailed logic to work around. A good example: I noticed that it's possible to deploy lookup fields via XML when the list they point to doesn't exist. This renders these columns useless. Refer back to the field markup above. There are guid-references to webs and lists that are not guaranteed to exist in a restored environment. I assumed that any such broken references would at least throw some kind of error when deployed; they silently fail instead.

And of course, you can't change the lookup list either via the UI or programmatically after the fact; even though SPFieldLookup.LookupList isn't read-only, an exception is thrown if you try to set it (follow the previous link for details). So here's some code to adjust the lookup list of existing lookup columns:

Code Listing 2

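//get web, field, and list
SPWeb web = SPContext.Current.Web;
SPFieldLookup field = web.Fields.GetFieldByInternalName("<internal name of lookup field>") as SPFieldLookup;
SPList list = web.Lists.TryGetList("<title of lookup list>");
//bail out if the field isn't a lookup or the target list doesn't exist
if (field == null || list == null)
    throw new InvalidOperationException("The lookup field or its target list was not found.");
//update lookup list id
field.SchemaXml = field.SchemaXml.Replace(field.LookupList, list.ID.ToString());
//update lookup web id
field.SchemaXml = field.SchemaXml.Replace(field.LookupWebId.ToString(), web.ID.ToString());
//save (true pushes the change down to lists already using the column)
field.Update(true);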

By using the SPField.SchemaXml property, we can get the control we need to force the lookup list for the lookup field. Of course, you should only do this when the list reference is broken and the field is essentially dead; merely wanting to point a perfectly fine lookup field to a new list is bad news, as this will orphan a lot of your existing lookup data.
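The string replacement above works because a guid is unlikely to appear anywhere else in the schema. A slightly more explicit variation, sketched here under the assumption that the schema is a single Field element with List and WebId attributes (as in Code Listing 1), sets the two attributes directly. LookupSchemaPatcher and Retarget are hypothetical names, and this has not been run against a live farm.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper; works on the schema string only, no SharePoint types.
public static class LookupSchemaPatcher
{
    // Returns a copy of the field schema whose List and WebId attributes
    // point at the given list and web. Other attributes are left alone.
    public static string Retarget(string schemaXml, Guid listId, Guid webId)
    {
        if (string.IsNullOrEmpty(schemaXml)) throw new ArgumentException("Schema is required.", "schemaXml");

        XElement field = XElement.Parse(schemaXml);

        // Code Listing 1 stores List with braces and WebId without them,
        // so the same formats are used here ("B" = braces, "D" = plain).
        field.SetAttributeValue("List", listId.ToString("B"));
        field.SetAttributeValue("WebId", webId.ToString("D"));

        return field.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string would be assigned to field.SchemaXml in place of the two Replace calls in Code Listing 2, followed by the same field.Update(true).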

Although code like this is fun, it's also totally hacky. A way better approach is to provision the entire site collection programmatically so that everything is guid-consistent across all machines; that's a much larger conversation for a different time. And like I said, since this is a hybrid team, all of our approaches, when it wasn't too difficult or political to disseminate a task strictly down the Visual Studio or SharePoint Designer route, will be hybrid as well. So configurators can do what they do out-of-the-box, and developers can do what they do to make it deployable. Code, therefore, is the glue that binds this whole wonderful mess together.

Page Layout Code Behind

Another thing that broke from part one: page layouts with code behind. We noticed that after provisioning a page with one of our layouts, SharePoint publishing sort of disappeared. The "Publish" tab vanished from the ribbon when the page was in edit mode, and all the nice versioning tools ("Save & Close," "Check In/Out," etc.) were reverted to the WSS-y equivalents, which are just "Save" and "Cancel." Well the problem turned out to be the code behind of our custom page layouts.

Any hand-rolled SharePoint page layout must inherit from PublishingLayoutPage. We all know this. What I didn't know was that this situation doesn't seem to support polymorphism. I created a class that inherits from PublishingLayoutPage, and had all my pages inherit from that. Although it "works" (inasmuch as it compiles and you can provision pages with code behind) it breaks the publishing infrastructure as described above. After the issue was raised to me, on a hunch, I tore out the base class, reverted all inheritance back to PublishingLayoutPage, redeployed, and it worked.

I can only guess that when the publishing infrastructure interrogates the page layout for publishing support, it checks to see if it explicitly inherits from the proper base class, rather than if it's derived from it. This might be a bug, but I can see Microsoft writing code to not support polymorphism here. They can't make the class sealed, so instead it's more of a roundabout kludge to discourage us from such implementations that could break publishing. I think this is the case because of other "discouragements" I've seen in this corner of SharePoint, such as not supporting multiple page libraries in a single site. There are many ways to get common functionality working without polymorphism (interfaces, static utility methods, attribution, etc.) so this isn't a show stopper.
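To make the interface and static utility options concrete, here is a minimal sketch of the shape such code could take. Everything in it (ISectionLayout, LayoutHelper, BuildBrowserTitle, ArticleLayout) is a hypothetical name invented for illustration, not code from this project. The layout class itself is shown only as a comment so the sketch compiles without the SharePoint assemblies.

```csharp
using System;

// Hypothetical contract for the data each layout exposes to shared code.
public interface ISectionLayout
{
    string SectionName { get; }
}

// Hypothetical static helper holding logic that would otherwise
// have lived in a shared base class.
public static class LayoutHelper
{
    // Builds "Section | Page title", falling back to the section name alone.
    public static string BuildBrowserTitle(ISectionLayout layout, string pageTitle)
    {
        if (layout == null) throw new ArgumentNullException("layout");

        return string.IsNullOrEmpty(pageTitle)
            ? layout.SectionName
            : layout.SectionName + " | " + pageTitle;
    }
}

// In a layout's code behind (needs a reference to Microsoft.SharePoint.Publishing.dll),
// the class inherits directly from PublishingLayoutPage and opts in via the interface:
//
//   public partial class ArticleLayout : PublishingLayoutPage, ISectionLayout
//   {
//       public string SectionName { get { return "Articles"; } }
//   }
```

The point of the design is that every layout keeps PublishingLayoutPage as its immediate base class, while the shared behavior lives in code that never appears in the inheritance chain.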

Taxonomy

Somehow, only just now have I delved into the world of SharePoint taxonomy; none of my other 2010 projects have required it. Based on what I've heard about and read about taxonomy, I assumed it was a nice little hierarchical list of terms with metadata, permissions, etc. But no. Instead it's this thing; it's this monstrosity that, in my opinion, is super complicated, super fragile, and, well, super difficult to work with for the unassuming SharePoint developer. What it does is great. However, the fact that there are hidden fields, obscure lookups, "catch all" columns, and other weirdness implies to me that although the thought was good, the implementation was rushed and sloppy.

But if SharePoint was judged strictly based on what was found under the hood and behind the scenes, the product would never have gained the popularity it now deserves and enjoys. In most cases, with out-of-the-box components, as long as it works, and the API isn't too horrible (if I need to customize things), I'm fine with it. However, if components are indeed poorly designed, then their flaws will quickly become your problems as soon as your application that depends on them wanders off the happy path.

The happy path divergence I'd like to discuss here is the persistence of managed metadata columns through site collection backups and restores. Like I said, the deployment from part one went well - or so I thought. It was only after backing up and restoring to our development environment, however, that all kinds of weird publishing errors around creating new pages, adding content types to lists, saving site columns, etc. began to appear. We all know how often the following phrase is muttered after a deployment: "It erred in prod? But it worked fine in dev!" Well, this scenario, interestingly enough, fell victim to the opposite of this common development problem.

I was able to track most of these issues down to managed metadata being extremely unmanageable. We all know that metadata columns have to be rewired to the proper term store as they are promoted through the environments. However, this problem was deeper: metadata columns seemed not only unhinged from their taxonomy, but from their content types as well. What happened is that the metadata fields didn't import properly into Visual Studio; the lookup column to the hidden taxonomy list was broken, and none of the hidden note fields came over.

Here are some of the error messages I came across:

Duplicate Site Column

"A duplicate field name "[field name]TaxHTField0" was found."

This happened whenever adding a new content type to a pages library. The best I can assume is that SharePoint was attempting to create a new hidden note field for the taxonomy column every time because one wasn't wired up proper. Here's another:

Metadata Field

"Failed to get the value of the "[field name]" column from the "Managed Metadata" field type control. See details in log. Exception message: Invalid field name. [id of field or internal name of field] [URL of site] /Pages."

This is caused by the metadata column not being wired up to the correct taxonomy term store. Such a scenario can also present itself by rendering "disabled" metadata field controls when the page is in edit mode (or you're viewing the page's properties):

Grayed Out Metadata

The fix to all these issues is two-fold: wire in the correct term store for each field, and drop/re-add all taxonomy columns from/to their content types. The only issue with this is that the values on the restored pages for these fields will be invalid, but since we're restoring to development, we don't care. To get around this, provision all metadata (fields, term stores, and even the group) via code with static guids. I will present code that does this in a later publication.

I implemented this as a site collection feature that gets activated upon a restore. This code is never to be run in production, as it will result in the aforementioned loss of taxonomy metadata. I have separate PowerShell scripts for deployments to different environments to ensure I don't make this mistake. (I will also describe these scripts in a later post.) Let's take a look at the FeatureActivated method of the feature receiver:

Code Listing 3

#region Events
public override void FeatureActivated(SPFeatureReceiverProperties properties)
{
    //initialization
    SPSite site = properties.Feature.Parent as SPSite;
    SPWeb web = site.RootWeb;
    //get content types to fix
    SPContentType page = web.TryGetStandardContentType(Constants.ContentType.CustomPageId);
    SPContentType document = web.TryGetStandardContentType(Constants.ContentType.CustomDocumentId);
    //remove metadata columns
    this.RemoveColumn(page, Constants.SiteColumns.TaxColumnId, Constants.SiteColumns.TaxColumnNoteId);
    this.RemoveColumn(page, Constants.SiteColumns.AnotherTaxColumnId, Constants.SiteColumns.AnotherTaxColumnNoteId);
    this.RemoveColumn(document, Constants.SiteColumns.TaxColumnId, Constants.SiteColumns.TaxColumnNoteId);
    this.RemoveColumn(document, Constants.SiteColumns.AnotherTaxColumnId, Constants.SiteColumns.AnotherTaxColumnNoteId);
    //fix columns
    this.FixMetadataColumn(site, Constants.SiteColumns.TaxColumnId, Constants.SiteColumns.TaxColumnNoteId, Constants.Taxonomy.CustomTermStore);
    this.FixMetadataColumn(site, Constants.SiteColumns.AnotherTaxColumnId, Constants.SiteColumns.AnotherTaxColumnNoteId, Constants.Taxonomy.AnotherCustomTermStore);
    //re-add metadata
    web.ExtendContentType(page.Id, Constants.SiteColumns.TaxColumnId, Constants.SiteColumns.AnotherTaxColumnId);
    web.ExtendContentType(document.Id, Constants.SiteColumns.TaxColumnId, Constants.SiteColumns.AnotherTaxColumnId);
}
#endregion
#region Private Methods
private void RemoveColumn(SPContentType ct, Guid fieldId, Guid noteFieldId)
{
    //remove tax field
    TaxonomyField field = ct.ParentWeb.Fields[fieldId] as TaxonomyField;
    if (field != null)
        ct.FieldLinks.Delete(field.Id);
    //delete note field
    SPField noteField = ct.ParentWeb.Fields[noteFieldId];
    if (noteField != null)
        ct.FieldLinks.Delete(noteField.Id);
    //update (true pushes the change to child content types)
    ct.Update(true);
}
//note: a regular instance method; the "this" modifier is only valid on static extension methods
private void FixMetadataColumn(SPSite site, Guid fieldId, Guid textFieldId, string termSetName)
{
    //get session
    TaxonomySession session = new TaxonomySession(site);
    //get store
    TermStore termStore = session.DefaultKeywordsTermStore;
    if (termStore == null)
        throw new Exception("The DefaultKeywordsTermStore was not found.");
    //get group
    Group group = termStore.Groups.Where(g => g.Name.Equals(Constants.Taxonomy.Group, StringComparison.InvariantCultureIgnoreCase)).FirstOrDefault();
    if (group == null)
        throw new Exception(string.Format("The {0} group was not found.", Constants.Taxonomy.Group));
    //get termset
    TermSet termSet = group.TermSets.Where(t => t.Name.Equals(termSetName, StringComparison.InvariantCultureIgnoreCase)).FirstOrDefault();
    if (termSet == null)
        throw new Exception(string.Format("The {0} term set was not found.", termSetName));
    //get taxonomy list
    SPList list = site.RootWeb.Lists.TryGetList(Constants.List.Taxonomy);
    if (list == null)
        throw new Exception(string.Format("The {0} list was not found.", Constants.List.Taxonomy));
    //get taxonomy field
    TaxonomyField field = site.RootWeb.Fields[fieldId] as TaxonomyField;
    if (field == null)
        throw new Exception(string.Format("The {0} field was not found.", fieldId));
    //update properties
    field.AnchorId = Guid.Empty;
    field.TermSetId = termSet.Id;
    field.TextField = textFieldId;
    field.SspId = termSet.TermStore.Id;
    field.TargetTemplate = string.Empty;
    //update lookup
    field.SchemaXml = field.SchemaXml.Replace(field.LookupList, list.ID.ToString());
    field.SchemaXml = field.SchemaXml.Replace(field.LookupWebId.ToString(), site.RootWeb.ID.ToString());
    //save
    field.Update(true);
}
#endregion

There are a few things to note here. First of all, on lines 8, 9, 19, and 20 you'll see two extension methods. TryGetStandardContentType extends an SPWeb to allow for the retrieval of an SPContentType from an SPContentTypeId. ExtendContentType simply adds new fields to existing content types.

The next block makes calls to RemoveColumn, which deletes the metadata column and hidden note field (if it exists at all) from the content type. Afterwards, all metadata columns are sent through the FixMetadataColumn method. This is the one that programmatically looks up the term store, group, and term set, wires up the taxonomy column to them, fixes the note field, and repairs the lookup to the hidden taxonomy list. Finally, we add the fields back to the content types. The rest of the code should be pretty straightforward.

Conclusion

So that's it: these were the issues I've run into with hybrid deployments of assets from SharePoint Designer and Visual Studio. In the projects I've been on that have done all structure programmatically, deployments have been very smooth and repeatable. And with the understanding that it will have to be done manually in multiple environments, my SharePoint configurators have also been able to provision portals in a timely manner with no major hang-ups.

But as for these hybrid teams, it takes a lot of diligence to make everything (and everyone) work well together. Our decision points have come down to our project manager asking me if I thought I could code something and get it into a WSP faster than one of our designers could do it manually three times directly on the server. Ugh. It's a lot of overhead to not only integrate page layouts, site columns, and content types into Visual Studio, but also to write patch up code to keep everything wired together proper. However, having WSPs at the end of the day is, in my opinion, invaluable.
