Category Archives: Task Agent

Periodically Rebuild Link Databases using an Agent in Sitecore

Last week a colleague asked me whether rebuilding the Link Database would solve an issue she was seeing. That conversation got me thinking: wouldn’t it be nice if we could automate rebuilding the Link Database for each Sitecore database on a schedule?

I am certain others have already created solutions to do this — if you know of any, please share in a comment — but I didn’t conduct a search to find any (I normally advocate not reinventing the wheel for code solutions but wanted to have some fun building a new solution).

In the spirit of my post on putting Sitecore to work for you, I built the following Sitecore agent (check out John West’s blog post on Sitecore agents to learn more):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Xml;

using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Diagnostics;
using Sitecore.Jobs;

namespace Sitecore.Sandbox.Tasks
{
    public class RebuildLinkDatabasesAgent
    {
        private static readonly IList<Database> Databases = new List<Database>();
        private static readonly Stopwatch Stopwatch = Stopwatch.StartNew();

        public void Run()
        {
            JobManager.Start(CreateNewJobOptions());
        }

        protected virtual JobOptions CreateNewJobOptions()
        {
            return new JobOptions("RebuildLinkDatabasesAgent", "index", Context.Site.Name, this, "RebuildLinkDatabases");
        }

        protected virtual void RebuildLinkDatabases()
        {
            Job job = Context.Job;
            try
            {
                RebuildLinkDatabases(Databases);
            }
            catch (Exception ex)
            {
                job.Status.Failed = true;
                job.Status.Messages.Add(ex.ToString());
            }

            job.Status.State = JobState.Finished;
        }

        private void RebuildLinkDatabases(IEnumerable<Database> databases)
        {
            Assert.ArgumentNotNull(databases, "databases");
            foreach (Database database in databases)
            {
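                // Restart rather than Start: the shared stopwatch is already running (StartNew),
                // and Start would keep accumulating time across databases.
                Stopwatch.Restart();
                RebuildLinkDatabase(database);
                Stopwatch.Stop();
                // ElapsedMilliseconds is the total elapsed time; Elapsed.Milliseconds is only the 0-999 component.
                LogEntry(database, (int)Stopwatch.ElapsedMilliseconds);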
            }
        }

        protected virtual void RebuildLinkDatabase(Database database)
        {
            Assert.ArgumentNotNull(database, "database");
            Globals.LinkDatabase.Rebuild(database);
        }

        protected virtual void LogEntry(Database database, int elapsedMilliseconds)
        {
            Assert.ArgumentNotNull(database, "database");
            if (string.IsNullOrWhiteSpace(LogEntryFormat))
            {
                return;
            }

            Log.Info(string.Format(LogEntryFormat, database.Name, elapsedMilliseconds), this);
        }

        private static void AddDatabase(XmlNode configNode)
        {
            if (configNode == null || string.IsNullOrWhiteSpace(configNode.InnerText))
            {
                return;
            }

            Database database = TryGetDatabase(configNode.InnerText);
            if (database != null)
            {
                Databases.Add(database);
            }
        }

        private static Database TryGetDatabase(string databaseName)
        {
            Assert.ArgumentNotNullOrEmpty(databaseName, "databaseName");
            try
            {
                return Factory.GetDatabase(databaseName);
            }
            catch (Exception ex)
            {
                Type agentType = typeof(RebuildLinkDatabasesAgent);
                Log.Error(agentType.ToString(), ex, agentType);
            }

            return null;
        }

        private string LogEntryFormat { get; set; }
    }
}

Logic in the class above reads in a list of databases set in a configuration file, adds them to a list for processing — these are only added to the list if they exist — and rebuilds the Link Database in each via a Sitecore job.

I added some timing logic to see how long it takes to rebuild each database, and capture this information in the Sitecore log.

I then wired up the above class in Sitecore using the following patch include configuration file:

<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <scheduling>
      <agent type="Sitecore.Sandbox.Tasks.RebuildLinkDatabasesAgent" method="Run" interval="00:01:00">
        <databases hint="raw:AddDatabase">
          <database>core</database>
          <database>master</database>
          <database>web</database>
        </databases>
        <LogEntryFormat>Rebuilt link database: {0} in {1} milliseconds.</LogEntryFormat>
      </agent>
    </scheduling>
  </sitecore>
</configuration>

I’ve set this agent to run every minute for testing, but it would probably be wise to run it no more than once or twice a day (for example, interval="12:00:00").

After waiting a bit, I saw the following in my Sitecore log:

[Screenshot: rebuilt-link-database]

I do question the rebuild times. They seem quite small, especially given how long it takes to rebuild the Link Databases via the Sitecore Control Panel. If you have any thoughts on why the times in my log differ from how long a rebuild takes in the Control Panel, please share in a comment.
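One thing worth checking whenever logged timings look too small: TimeSpan.Milliseconds is only the millisecond component of a duration (0 to 999), while TimeSpan.TotalMilliseconds and Stopwatch.ElapsedMilliseconds report the whole duration. The sketch below is a minimal illustration of that difference in plain .NET; the class and method names are hypothetical and nothing in it touches Sitecore.

```csharp
using System;

public static class ElapsedSketch
{
    // Millisecond *component* only: whole seconds, minutes and hours are dropped.
    public static int ComponentMilliseconds(TimeSpan elapsed)
    {
        return elapsed.Milliseconds;
    }

    // Entire duration expressed in milliseconds.
    public static long TotalMilliseconds(TimeSpan elapsed)
    {
        return (long)elapsed.TotalMilliseconds;
    }
}
```

For a duration of 1 minute, 23 seconds and 456 milliseconds, ComponentMilliseconds returns 456 while TotalMilliseconds returns 83456, so a log built from the component alone would under-report any rebuild that takes longer than a second.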

Further, if you have any recommendations on making this code better, or have other ideas on automating the rebuilding of Link Databases in Sitecore, please drop a comment.

Until next time, have a Sitecoretastic day!

Put Sitecore to Work for You: Build Custom Task Agents

How many times have you seen some manual process and thought about how much easier your life would be if that process were automated?

Custom Sitecore task agents can help you achieve some of that automation in Sitecore.

Last night, I built a custom task agent that deletes “expired” items from the recycle bin in all three Sitecore databases, where expired means items that have been sitting in the recycle bin for longer than a specified number of days.

I came up with this idea after remembering an instance where there were so many items in the recycle bin that finding an item to restore was harder than finding a needle in a haystack.

Using .NET Reflector, I looked at how Sitecore.Tasks.UrlAgent in Sitecore.Kernel.dll was coded to see whether I had to do anything special when building my agent, such as inheriting from a custom base class. I also looked at the code in Sitecore.Shell.Applications.Archives.RecycleBin.RecycleBinPage in Sitecore.Client.dll, along with Sitecore.Shell.Framework.Commands.Archives.Delete in Sitecore.Kernel.dll, to figure out how to permanently delete items in the recycle bin.
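To make the "anything special" question concrete, here is a minimal sketch of the shape an agent class can take, judging by the agents in these posts: a plain class with public settable properties and a public method to call on each interval, with no base class. HeartbeatAgent, Message, and RunCount are hypothetical names for illustration, and the sketch writes to the console rather than the Sitecore log so that it runs without any Sitecore assemblies.

```csharp
using System;

namespace Sitecore.Sandbox.Tasks
{
    // Hypothetical example agent: a plain class, no base class or interface.
    public class HeartbeatAgent
    {
        // Assumption: set from a <Message> child element of the <agent> node,
        // the same way the agents in these posts receive their settings.
        public string Message { get; set; }

        // Counts invocations so the behaviour is easy to observe.
        public int RunCount { get; private set; }

        // The method named in the agent's method attribute.
        public void Run()
        {
            RunCount++;
            Console.WriteLine("{0} (run #{1})", Message ?? "Heartbeat", RunCount);
        }
    }
}
```

In a real agent the Console.WriteLine call would presumably be replaced by a call to Sitecore.Diagnostics.Log.Info, as in the agents shown in these posts.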

After doing my research in those two assemblies, I came up with this custom task agent:

using System;
using System.Collections.Generic;
using System.Linq;

using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.Archiving;
using Sitecore.Diagnostics;

namespace Sitecore.Sandbox.Tasks
{
    public class RecycleBinCleanupAgent
    {
        private const string ArchiveName = "recyclebin";
        private static readonly char[] Delimiters = new char[] { ',', '|' };

        public IEnumerable<string> DatabaseNames { get; set; }

        private IEnumerable<Database> _Databases;
        private IEnumerable<Database> Databases
        {
            get
            {
                if (_Databases == null)
                {
                    _Databases = GetDatabases();
                }

                return _Databases;
            }
        }

        public int NumberOfDaysUntilExpiration { get; set; }

        public bool Enabled { get; set; }

        public bool LogActivity { get; set; }

        public RecycleBinCleanupAgent(string databases)
        {
            SetDatabases(databases);
        }

        private void SetDatabases(string databases)
        {
            Assert.ArgumentNotNullOrEmpty(databases, "databases");
            DatabaseNames = databases.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries).Select(database => database.Trim()).ToList();
        }

        public void Run()
        {
            if (Enabled)
            {
                RemoveEntriesInAllDatabases();
            }
        }
        
        private void RemoveEntriesInAllDatabases()
        {
            DateTime expired = GetExpiredDateTime();

            if (expired == DateTime.MinValue)
            {
                return;
            }

            foreach (Database database in Databases)
            {
                RemoveEntries(database, expired);
            }
        }

        private DateTime GetExpiredDateTime()
        {
            if (NumberOfDaysUntilExpiration > 0)
            {
                return DateTime.Now.AddDays(-1 * NumberOfDaysUntilExpiration).ToUniversalTime();
            }

            return DateTime.MinValue;
        }

        private void RemoveEntries(Database database, DateTime expired)
        {
            int deletedEntriesCount = RemoveEntries(GetArchive(database), expired);
            LogInfo(deletedEntriesCount, database.Name);
        }

        private static int RemoveEntries(Archive archive, DateTime expired)
        {
            IEnumerable<ArchiveEntry> archiveEntries = GetAllEntries(archive);
            int deletedEntriesCount = 0;

            foreach (ArchiveEntry archiveEntry in archiveEntries)
            {
                if (ShouldDeleteEntry(archiveEntry, expired))
                {
                    archive.RemoveEntries(CreateNewArchiveQuery(archiveEntry));
                    deletedEntriesCount++;
                }
            }

            return deletedEntriesCount;
        }

        private static IEnumerable<ArchiveEntry> GetAllEntries(Archive archive)
        {
            Assert.ArgumentNotNull(archive, "archive");
            return archive.GetEntries(0, archive.GetEntryCount());
        }

        private static bool ShouldDeleteEntry(ArchiveEntry archiveEntry, DateTime expired)
        {
            Assert.ArgumentNotNull(archiveEntry, "archiveEntry");
            Assert.ArgumentCondition(expired > DateTime.MinValue, "expired", "expired must be set!");
            return archiveEntry.ArchiveDate <= expired;
        }

        private static ArchiveQuery CreateNewArchiveQuery(ArchiveEntry archiveEntry)
        {
            Assert.ArgumentNotNull(archiveEntry, "archiveEntry");
            return CreateNewArchiveQuery(archiveEntry.ArchivalId);
        }

        private static ArchiveQuery CreateNewArchiveQuery(Guid archivalId)
        {
            Assert.ArgumentCondition(archivalId != Guid.Empty, "archivalId", "archivalId must be set!");
            return new ArchiveQuery { ArchivalId = archivalId };
        }

        private void LogInfo(int deletedEntriesCount, string databaseName)
        {
            bool canLogInfo = LogActivity
                              && deletedEntriesCount > 0 
                              && !string.IsNullOrEmpty(databaseName);

            if (canLogInfo)
            {
                Log.Info(CreateNewLogEntry(deletedEntriesCount, databaseName, ArchiveName), this);
            }
        }

        private static string CreateNewLogEntry(int expiredEntryCount, string databaseName, string archiveName)
        {
            return string.Format("{0} expired archive entries permanently deleted (database: {1}, archive: {2})", expiredEntryCount, databaseName, archiveName);
        }

        private IEnumerable<Database> GetDatabases()
        {
            if (DatabaseNames != null)
            {
                return DatabaseNames.Select(database => Factory.GetDatabase(database)).ToList();
            }

            return new List<Database>();
        }

        private static Archive GetArchive(Database database)
        {
            Assert.ArgumentNotNull(database, "database");
            return ArchiveManager.GetArchive(ArchiveName, database);
        }
    }
}

Basically, each time the agent runs, it loops over all recycle bin entries in the specified databases and removes any entry whose archival date falls on or before the expiration cutoff, a date I derive from the NumberOfDaysUntilExpiration setting. The run interval and target databases are set in the following patch config file:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <scheduling>
      <agent type="Sitecore.Sandbox.Tasks.RecycleBinCleanupAgent, Sitecore.Sandbox" method="Run" interval="01:00:00">
        <param desc="databases">core, master, web</param>
        <NumberOfDaysUntilExpiration>30</NumberOfDaysUntilExpiration>
        <LogActivity>true</LogActivity>
        <Enabled>true</Enabled>
      </agent>
    </scheduling>
  </sitecore>
</configuration>
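The expiration check itself can be sketched in isolation, which makes it easy to try out without a Sitecore instance. This is only an illustration of the cutoff logic: Entry is a hypothetical stand-in for Sitecore's ArchiveEntry, and the sketch assumes archive dates are stored in UTC, which is why the agent converts its cutoff to universal time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExpirationSketch
{
    // Hypothetical stand-in for ArchiveEntry; only the fields the check needs.
    public sealed class Entry
    {
        public Guid ArchivalId { get; set; }
        public DateTime ArchiveDateUtc { get; set; }
    }

    // Returns the UTC cutoff, or DateTime.MinValue when the setting is not positive,
    // mirroring how the agent treats a missing or zero NumberOfDaysUntilExpiration.
    public static DateTime GetCutoffUtc(DateTime utcNow, int numberOfDaysUntilExpiration)
    {
        return numberOfDaysUntilExpiration > 0
            ? utcNow.AddDays(-numberOfDaysUntilExpiration)
            : DateTime.MinValue;
    }

    // Selects the entries archived on or before the cutoff; nothing is selected
    // when the cutoff is DateTime.MinValue.
    public static IList<Entry> SelectExpired(IEnumerable<Entry> entries, DateTime cutoffUtc)
    {
        if (cutoffUtc == DateTime.MinValue)
        {
            return new List<Entry>();
        }

        return entries.Where(entry => entry.ArchiveDateUtc <= cutoffUtc).ToList();
    }
}
```

With a 30-day setting, an entry archived 45 days ago is selected for deletion and one archived 10 days ago is kept; a setting of 0 selects nothing, so a misconfigured agent does not empty the recycle bin.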

I had 89 items in my master database’s recycle bin before my task agent ran:

[Screenshot: before-recycle-bin-agent-runs-master]

I walked away for a bit to watch some television, eat dinner, surf Twitter, and phone my brother. I then returned to see the following in my Sitecore log:

[Screenshot: after-recycle-bin-agent-runs-master-log]

I went into the recycle bin in my master database, and saw there were 5 items left after my task agent executed:

[Screenshot: after-recycle-bin-agent-runs-master]

As you can see, my custom task agent deleted expired recycle bin items as designed. 🙂