Thursday, September 03, 2009

Partial Cache Clearing in Sitecore

For a very long time, some very serious performance issues have affected certain types of Sitecore deployments. It has to do with scaling a Sitecore solution into webfarms (multiple content delivery servers) and the use of the Sitecore Staging Module.

 

In very short summary, if the website is updated frequently (say, for instance, an editorial staff of 10 or so, posting relevant market updates) AND the website is under heavy load, most practical uses of the tool Sitecore recommends and supports will more or less kill performance on your SQL Servers or whatever storage mechanism you have in place.

 

I’ve never blogged in detail about this issue, as I didn’t have a client myself that was affected directly by this issue. Paul George has blogged about it in detail however, and if you want to learn more about what this issue is and if it could be affecting you, take a look at these posts:

 

 

So why write about it now?

 

The good news

Alex Shyba posted about a new Sitecore Shared Source module that was just made publically available; SitecoreStager – Sitecore Partial (item) Cache Clearing Module.

 

In short, instead of just uncritically clearing the entire cache on your target server, dropping user sessions, putting out the cat and so on – the SitecoreStager will instead execute a partial cache clear and in essence just clear items from the cache that were affected by the publishing operation.

 

This is very good news indeed, and for those clients who have been affected by this problem I am sure it’s tipped hats all ‘round :-)

 

The bad news

This is actually something that has been nagging me a bit for some time now, and has just been refreshed by this release.

 

It is Shared Source.

 

Don’t get me wrong. I think the Sitecore Shared Source Library is an excellent idea. I have code in there myself, and it’s a perfect place to go look for those “there has to be someone else who had this problem and solved it” code snippets and field types and whatever it may be.

 

But it is also, for obvious reasons, unsupported by Sitecore themselves. I mean, how could they?  Half the modules and code pieces in there come from independent sources like myself – and it wouldn’t be reasonable to expect Sitecore to support them.

 

But what about the code Sitecore release to the library themselves?

 

Would you not agree; “With Sitecore, you can have a team of editors publishing content to an enterprise level site, and performance will still rock” is a statement you’ll find (albeit probably not verbatim) in Sitecore marketing material?   But it is unsupported?

 

I feel somewhat the same for the Multiple Sites Manager…  shouldn’t a standard Sitecore have a solution for regular (advanced) editors to set up new websites without having to edit configuration files?  Or is that limited to Foundry licenses only?

 

What I mean is

Sitecore marketing materials doesn’t exactly help anyone much, when they boast “Comes with Blog Modules, Wiki Modules and even Microsoft Dynamix and Microsoft BizTalk integration (yes…)”; yet offers little or no actual support other than pointing the users, the licensees, the Sitecore implementors (like myself) in the direction of the Shared Souce Library, shrug and go… “From here, you’re on your own”.

 

Kudos for making this module. I think this belongs in the core package however, fully supported and updated with every Sitecore release. That’s my 2 cents worth.

 

 

.

Tuesday, June 23, 2009

Code Monkeys?

Before I begin, I’m breaking routine here – this post is in no way Sitecore related.

 

So here I was, reading through my blogroll, catching up on bits and bobs from around the world. And then one of the sites decides to confront me with the following ad:

 

chimps

 

I am, by own admission, quite possibly a bit over sensitive to the idea that developers are interchangeable code monkeys. I’ve worked in the internet industry pretty much since the inception in the mid 90s, and I will never subscribe to the notion that one developer (me) can easily be replaced by a developer working offshore for as low as $ USD 7.00 a day.

 

But then again, maybe that’s just me ;-)

 

Am I missing a point here?

Friday, June 05, 2009

Just because you can doesn’t mean you should

For those of you who don’t know this; I make my living as a Sitecore Professional Services Consultant. Understand this in the context of working with the Sitecore product as a consultant, I don’t actually work for Sitecore the company.

 

My job, if you will, consists primarily of establishing contact with newly started Sitecore Partners or want-to-become Partners, and bringing to them whatever skills I can offer to help them take those first shaky steps when bringing their first Sitecore Project to life.

 

Don’t worry, am not going to start any kind of sales pitch here, that’s not what blogs this blog are is for. I’m only telling, so you know what sort of context this post is in and where my experiences are coming from.

 

As you can maybe imagine (or even remember still?); the myriad of questions in relation to Sitecore coming from developers, sales people, project managers and so on are many and not far between. Nothing wrong with that, obviously, we all have to learn. Mostly the questions start out around the capabilities of the product, “can” questions. After this, they move into “how”, and ultimately (in the cases where I’m actually lucky enough to be around for the entire project) “when”.

 

"Can” questions

Initially, when requirements are being gathered, it’s a heap of “Can Sitecore do this?” type of questions. To name just a very few examples:

 

  • Can you integrate Sitecore with SharePoint?
  • Can you use the AJAX toolkit on a Sitecore site?
  • Can Sitecore deliver Flash content?
  • Can we build a database of our people and offices in Sitecore and have them shown on the site?
  • Can we have search options on our site?
  • Can Sitecore integrate with Google Analytics?
  • Can we implement breadcrumbs with Sitecore?

 

There are, of course, many more questions that are usually asked – as I’m sure any member of the Sitecore sales team will tell you as well. Incidentally, the answer to all of the above questions is “yes”.

 

And now we’re getting to the point. Working with Sitecore – how often is it, that you actually have to sat “No”. “Sorry. Can’t do that with Sitecore”?   While I have no statistics on it, I can still state without shaking my hand that this very rarely happens. Sitecore allows you to say “Yes. Can do” to almost any requirement, no matter how far fetched it may seem or even be.

 

Let’s not get too carried away, however, this is not quite as amazing as it might appear on the surface. If we boil it down to the bone, Sitecore can be described as an “ASP.NET application that adds Content Management services to your ASP.NET websites”. Right – I’m sure there’s a hundred different marketing spins to be made here, but bear with me… ;-) 

 

Sitecore is an ASP.NET application. It sits, talks, walks and sounds like an ASP.NET application. And here’s the good bit – and the reason you can answer “yes” to almost any requirement – anything you can do in ASP.NET, you can continue doing – Sitecore or no Sitecore involved.

 

Even the Sitecore product itself is flexible to the bone - (almost) everything is based on configuration files, and they can be tweaked and twisted or completely rewritten – in case there’s a particular feature you are missing, or if there’s a certain way Sitecore works that you want to change or even remove entirely. A blessing, right?

 

Wrong!  And on after-thought; “Wrong… mostly!”

 

Don’t get me wrong. I totally get where Sitecore is coming from, in wanting to create as flexible a platform as possible. As any software vendor on the market; the more requirements you can say “yes” to, the more sales you are likely to get. This is as simple and obvious as can be, really.

 

I’m just saying; maybe “Yes. But…” is a better answer. Let’s take a look at some more “can” questions. These are not fictional by the way, they are real questions asked by real people.

 

“Can” questions part 2

 

  • Can you rewrite how Sitecore creates new items, so that all language versions are created instead of just the current language you are editing?
  • Can you use ASP.NET Master Pages and Content Pages with Sitecore?   Does Sitecore support nested Master Pages?
  • Can you work with Web Parts in Sitecore?
    • This one deserves a closer examination; and after doing that what is usually really asked here is, “Can I put my page into edit mode, and add/remove Web Parts on different placeholder areas of the page?”
  • Can I use Microsoft Word to edit my pages?
  • Can I use the Microsoft MVC Framework with Sitecore?

 

Do you see where this is going?

 

What are these people really asking for?  Features?

 

Nope. In the vast majority of cases, these are not feature requests. These people are looking for familiarity.

 

Yep. That’s right. Familiar ground to stand on. Something known, as opposed to something unknown. Can we blame them?  Absolutely not. I think we all have a bit of fear of the unknown, to one degree or another.

 

And of course, Sitecore is fully aware of this as well. They have released features specifically to address some of these very questions already, and there are more to come. That way, we can continue saying “yes” and everyone wins. Just keep this point in mind however – familiarity. Not features. More often than not.

 

The solution

I guess the word “solution” is not really appropriate, as it seems to indicate there was a problem to begin with. Look behind the question. Find out why it is being asked in the first place, and then work from there.

 

Why does the person want to change how Sitecore creates language versions?

 

Turns out, this is apparently how other major CMS vendors do it. Now I happen to think Sitecore is doing this the right way, but this person came from a different background with different experience.   He was looking for a way to solve (amongst other things) content translation flows – and was maybe not aware of the various tools and gadgets that Sitecore provide for this purpose.

 

Master Pages?

 

Think “overworked .NET developer who really cannot be bothered to try and figure out how Sitecore constructs the pages it delivers”.

 

I certainly get where he or she is coming from. But don’t go chasing down this road – sit down and show (don’t tell; show) how a Sitecore “Layout” and an ASP.NET “Master Page” is more or less essentially the same thing. (I know…  but really – in most cases, this is just semantics).   So “placeholders”, not “content placeholders”. “Layouts”, not “Master Pages”.

 

Am sure you’re seeing the pattern here, by now :-)

 

All I’m saying is; “Yes. You absolutely CAN reconfigure Sitecore, so that security is completely disabled, users and roles come off a scan of your table napkin, your coffee machine starts automatically when you publish AND it will even offer background music when content editing”. Ok I’m being ironic, obviously, but see the point is this.

 

Just because you can doesn’t mean you should

 

Don’t go mucking about with Sitecore, jumping through hoops to make it act a certain way. Chances are – and most times – you aren’t dealing with a real requirement at all. All too often have I arrived at a “young” Sitecore shop and seen a development team dive right into a complete reconfiguration of how Sitecore works.  Pipelines altered and skewed and tonnes of bespoke code developed because “Otherwise we couldn’t meet our clients requirements”.  Err…  WHAT requirements were those exactly?

 

I mean, come on. I’ve done lots and lots of Sitecore sites now – and I can count the number of times I’ve actually had to modify standard Sitecore behaviour on maybe one hand.  This does not include adding new modules or field types, or anything like that – I’m talking about core changes such as altering the publishing pipeline, messing about with security resolvers and so on.

 

Not pointing fingers at anyone in particular here. If I were, I would have to point to myself first – I’m as guilty of trying to change something from what it is into something that is familiar as I gather anyone else is.

 

I just think it’s something we should all keep in mind.

 

And oh… before I end.

 

Sometimes the question DOES lead to a “legitimate” feature request ;-)

 

 

-

Friday, May 15, 2009

Listing “Related Articles” with Sitecore using the LinkDatabase

Seems I am on a writing streak this week. Am taking a week off, you see, from my normal everyday Sitecore Consulting, and seem to have a bit of time on my hands to catch up on some of all the posts I’ve been meaning to write for a while. Don’t worry; after this I will probably be way too busy again for a while to find time to post ;-)

 

So I catching up on StackOverflow the other day, and an interesting question was posed; “How to find related items by tags in Lucene.NET”.

 

And while there probably IS a way to actually do this with Lucene.NET; I remember my initial thought was “but why go through all the hassle of configuring and setting it up to do this?”. Not only would it matter things from an Operations point of view; it would require more code and more code that was completely dependant on specific configuration settings in the Lucene indexes.

 

Now, let me be very clear, I am no big expert on Lucene. There are many of you out there who know it well, and would probably be able to cook up a solution to answer the guys question using it. As for myself, I try to keep as much arcane configuration out of any project I am involved in – especially to solve a problem such as this, where Sitecore pretty much gives you the tools you need to solve it straight out of the box.

 

So anyway. Guy was asking in a Lucene context, but was looking for proposals. And I decided to give it a whirl, mocked up some pseudo code to solve the problem, and that was that. But see; everyone can write pseudo-code :P   And it’s only fair I put my err… code where my mouth is, and write up a real example of how this can be achieved in a manner I explained. Here goes.

 

Setting it up in Sitecore

I start by making up two templates:

 

1) “Simple Value”, which will be used to organise the meta tags I will be drawing upon.

It has no fields.

 

2) “Article”, which I will use to demonstrate how to implement “Related Articles” functionality.

image

 

I then set up a meta-structure that I will be using to tag up my articles, and ultimately draw out related articles. I don’t fill out the entire structure, nor do I mean to imply this structure is perfect. But it is enough to demonstrate the point, and should be easy enough to follow. All the tags are based on the “Simple Value” template.

 

image

 

After this, I go through the somewhat tedious task of setting up a number of articles that are tagged in different ways.

 

For now, I type and tag in 7 articles; like this:

 

Name: Ben Hur

Tags: O2 Arena, Theatre

 

Name: Britney Spears

Tags: O2 Arena, Pop, Concert

 

Name: Depeche Mode

Tags: O2 Arena, Alternative, Concert

 

Name: Michael Jackson

Tags: O2 Arena, Pop, Concert

 

Name: Nickelback

Tags: O2 Arena, Rock, Concert

 

Name: Pet Shop Boys

Tags: O2 Arena, Pop, Concert

 

Name: War of the Worlds

Tags: O2 Arena, Theatre

 

I should probably go on for a while longer if I really wanted to go all-out in demonstrating this. However, I do have enough now, and it’ll have to do. I hate typing in test data ;-)

 

Before I go on, I should explain exactly how I intend to deduce what “related articles” should be. It can be done and determined in many ways – but I am proceeding exactly in the manner that was originally in question on StackOverflow. The rule can be described as two statements:

 

1) An article is related if it shares one or more tags with the source article

2) The more tags it shares, the more relevant it becomes (i.e. should appear higher on the list)

 

Lastly, I set up a blank .ASPX page in my webroot named “TestRelated.aspx”, and I quickly mock up two DomainObjects that I will build upon for this functionality.

 

SimpleValue.cs

using CorePoint.DomainObjects.SC;
using CorePoint.DomainObjects;

namespace Website.Related
{
    [Template("user defined/simple value")]
    public class SimpleValue : StandardTemplate
    {
    }
}

 

Article.cs

using System;
using System.Collections.Generic;
using CorePoint.DomainObjects.SC;
using CorePoint.DomainObjects;

namespace Website.Related
{
    [Template("user defined/article")]
    public class Article : StandardTemplate
    {
        [Field("title")]
        public string Title { get; set; }

        [Field("text")]
        public string Text { get; set; }

        [Field("tags")]
        public List<Guid> Tags { get; set; }
    }
}

 

And finally, in my TestRelated.aspx.cs, I add a bit of code to test that everything is as expected.

public partial class TestRelated : System.Web.UI.Page
{
    protected void Page_Load( object sender, EventArgs e )
    {
        var director = new SCDirector();

        List<Article> articles = director.GetChildObjects<Article>( "/sitecore/content/global/articles" );
        foreach ( Article article in articles )
        {
            // Get the SimpleValues (name) from the tag Guids
            var simpleValues = article.Tags.ConvertAll<string>( a => 
                        { 
                            return director.GetObjectByIdentifier<SimpleValue>( a ).Name; 
                        } );
            StringBuilder sb = new StringBuilder();
            simpleValues.ForEach( sv => sb.Append( sv + ' ' ) );

            Response.Write( string.Format(
                             "Name: {0}<br />Tags: {1}<br /><br />",
                             article.Name,
                             sb.ToString() ) );
        }
    }
}

 

So far so good. I run the code, and I get a replica of the list I already showed:

 

Name: Ben Hur
Tags: O2 Arena Theater
Name: Britney Spears
Tags: Pop Concert O2 Arena
Name: Depeche Mode
Tags: O2 Arena Concert Alternative
Name: Michael Jackson
Tags: Pop Concert O2 Arena
Name: Nickelback
Tags: Rock Concert O2 Arena
Name: Pet Shop Boys
Tags: O2 Arena Concert Pop
Name: War of the Worlds
Tags: O2 Arena Musical

 

Excellent. After all this, I am now ready to proceed to the good stuff ;-)

 

Finding Related Articles using the Sitecore LinkDatabase

Having an Article entity in place, makes this an obvious place to add functionality such as Related Articles. I could either add it as a Lazy Load property named “Related Articles”, or I could write a method named “GetRelatedArticles()”. This is mostly down to aesthetics and practices; personally I prefer the first option.

 

I expand the Article.cs with a little bit of code. The original pseudo-code I suggested, is entered in comments, for reference.

private int _referenceCount;
List<Article> _RelatedArticles = null;
public List<Article> RelatedArticles
{
    get
    {
        if ( _RelatedArticles == null )
        {
            _RelatedArticles = new List<Article>();
            var referenceCount = new Dictionary<Guid, int>();

            // for each ID in tags
            foreach ( Guid id in Tags )
            {
                var sv = Director.GetObjectByIdentifier<SimpleValue>( id );

                // Personal note: In this particular instance, performance
                // could be gained here, but not loading up full articles
                // via DomainObjects but hitting the LinkDatabase directly instead

                // get all documents referencing this tag
                List<Article> articles = sv.GetReferrers<Article>();

                // for each document found
                articles.ForEach( a =>
                    {
                        if ( a.Id != Id )
                        {
                            // if master-list contains document; 
                            if ( referenceCount.ContainsKey( a.Id ) )
                                referenceCount[ a.Id ]++; // increase usage-count
                            else // else; 
                                // add document to master list
                                referenceCount[ a.Id ] = 1;
                        }
                    } );
            }

            // Now we have a list of all the relevant guids being referenced on all tags
            // on this article. Load them up, and stamp them with the reference count
            foreach ( var key in referenceCount.Keys )
            {
                var relatedArticle = Director.GetObjectByIdentifier<Article>( key );
                relatedArticle._referenceCount = referenceCount[ key ];
                _RelatedArticles.Add( relatedArticle );
            }
            
            // sort master-list by usage-count descending
            _RelatedArticles.Sort( ( a, b ) => b._referenceCount.CompareTo( a._referenceCount ) );
        }

        return _RelatedArticles;
    }
}

 

And to test if what I’m getting from this is what I expect, I also add some code to my TestRelated.aspx so it becomes:

protected void Page_Load( object sender, EventArgs e )
{
    var director = new SCDirector();

    List<Article> articles = director.GetChildObjects<Article>( "/sitecore/content/global/articles" );
    foreach ( Article article in articles )
    {
        // Get the SimpleValues (name) from the tag Guids
        var simpleValues = article.Tags.ConvertAll<string>( a => 
                    { 
                        return director.GetObjectByIdentifier<SimpleValue>( a ).Name; 
                    } );
        StringBuilder sb = new StringBuilder();
        simpleValues.ForEach( sv => sb.Append( sv + ", " ) );

        Response.Write( string.Format(
                         "Name: {0}<br />Tags: {1}<br />Related Articles: ",
                         article.Name,
                         sb.ToString() ) );

        article.RelatedArticles.ForEach( ra =>
                Response.Write( string.Format( "{0},", ra.Name ) ) );

        Response.Write( "<hr />" );
    }
}

 

And after all this, I am pleased to find a result looking like:

 

Name: Ben Hur
Tags: O2 Arena, Theater,
Related Articles: Michael Jackson,Britney Spears,Depeche Mode,Nickelback,Pet Shop Boys,War of the Worlds,


Name: Britney Spears
Tags: Pop, Concert, O2 Arena,
Related Articles: Michael Jackson,Pet Shop Boys,Depeche Mode,Nickelback,Ben Hur,War of the Worlds,

Name: Depeche Mode
Tags: O2 Arena, Concert, Alternative,
Related Articles: Britney Spears,Michael Jackson,Nickelback,Pet Shop Boys,War of the Worlds,Ben Hur,
Name: Michael Jackson
Tags: Pop, Concert, O2 Arena,
Related Articles: Britney Spears,Pet Shop Boys,Depeche Mode,Nickelback,Ben Hur,War of the Worlds,

Name: Nickelback
Tags: Rock, Concert, O2 Arena,
Related Articles: Britney Spears,Depeche Mode,Pet Shop Boys,Michael Jackson,Ben Hur,War of the Worlds,
Name: Pet Shop Boys
Tags: O2 Arena, Concert, Pop,
Related Articles: Britney Spears,Michael Jackson,Depeche Mode,Nickelback,War of the Worlds,Ben Hur,

Name: War of the Worlds
Tags: O2 Arena, Musical,
Related Articles: Ben Hur,Britney Spears,Depeche Mode,Nickelback,Pet Shop Boys,Michael Jackson,

 

The first thing that strikes me is; my meta data and test data probably aren’t extensive enough to really see this functionality in full effect. They all look almost the same.

 

However, I can determine that it works as expected. “Britney Spears”, “Michael Jackson” and “Pet Shop Boys” all share the same 3 meta tags. They SHOULD in all instances suggest the “one left out” on top of the list as “Related Articles”.  And they all do; I’ve marked them in bold and underline.  Also note that the “Depeche Mode” concert in O2 Arena lists other concerts (although of different music genre) before it proceeds to list the musicals and theatre plays.

 

It works :-)

 

A few notes on performance

In this post, I’ve deliberately not focused excessively on performance implications. Don’t worry – it’s not at all bad. But in “real life”; there are still obvious places in this code where you could potentially gain a significant amount of performance. As everyone will know; I/O operations are by an order of magnitude some of the most expensive calls we can make, and there is definitely a few places you could set in here.

 

A few suggestions I would look into if I were to take this code live:

 

  • Code up a TagController; that will eventually act as a cache for all the tags in your solution. Load up the tags only once, and don’t repeatedly re-load them in your loops.
  • In this case, bypass the very convenient .GetReferrers() method provided by DomainObjects and go through the extra work of working with the LinkDatabase directly yourself. For this part of the algorithm (counting up how many times a given ID is referencing your tag), you don’t really need to load up the Sitecore Item – something .GetReferrers() will automatically do. I will put this on the TODO list for DomainObjects.
  • And – as ALWAYS – don’t forget to configure caching for whatever sublayouts and/or user controls you are calling this functionality on.

 

That’s it for this time. I hope you found this useful  :-)

Labels: , ,