Take a Walk through your SharePoint Farm

When I was tasked with upgrading our long-neglected ‘intranet’ last year, my first job was to work out just how much data was out there and what needed to be upgraded.

The masterpage had been butchered some time in the past, so most of the pages were missing navigation, making it hard to follow sites down the hierarchy.  And what a hierarchy!  The architects of the original instance apparently never worked out that you could have more than one document library per site, or that you could create folders.  The result is the typical site sprawl.  To add to the fun, some sites were built from a custom template that no longer works, and others didn’t have any files in them at all.

In order to create a list of all the sites and how they relate, you can use a PowerShell script:

[code lang="PowerShell"]

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint") > $null

function global:Get-SPSite($url){
    return New-Object Microsoft.SharePoint.SPSite($url)
}

#Recursively output each web as: web GUID;parent GUID;title;URL
function Enumerate-Sites($website){
    foreach($web in $website.GetSubwebsForCurrentUser()){
        [string]$web.ID+";"+[string]$web.ParentWeb.ID+";"+$web.Title+";"+$web.Url
        Enumerate-Sites $web
        $web.Dispose()
    }
    $website.Dispose()
}

#Change this variable to your site collection URL
$siteCollection = Get-SPSite("http://example.org")

$start = $siteCollection.RootWeb

Enumerate-Sites $start
$start.Dispose()
[/code]
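
If you save the snippet above as, say, Enumerate-Sites.ps1 on a farm server (the file names here are just examples), you can redirect its output to a text file for the Visio step described below:

[code lang="PowerShell"]
# Hypothetical usage: capture the semicolon-delimited output for later analysis
.\Enumerate-Sites.ps1 > C:\temp\siteHierarchy.csv
[/code]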

It’s actually pretty simple since we take advantage of recursion.  It’s pretty much a matter of getting a handle on the root web of the site collection, outputting its GUID and parent site GUID and then the human-readable title and URL.  Then you do the same for each subsite, and so on down the branch.

The reason we’re outputting the GUIDs is so we can relate each site to its parent.  The script outputs straight to the console, but if you pipe the output to a text file you can use it as the input to an org-chart diagram in Visio.  The results are terrifying:

[Image: Visio org chart generated from the script output, showing the full site hierarchy]

Each node on that diagram is a site that may have one document, or thousands.  Or nothing.  Or the site may simply not work.  As it turned out, when asked to prioritize material for migration, the stakeholders decided it would be easier just to move the day-to-day stuff and leave the old farm going as an ‘archive’.  Nothing like asking a client to do work to get your scope reduced!

As a final note on this script: it is recursive, so it could (according to my first-year Comp. Sci. lecturer) theoretically balloon out of control, consume all the resources in the visible universe and collapse into an ultradense black hole, crashing your server in the process.  You’d have to have a very deep tree structure for that to happen, though, in which case you’d probably want to partition it off into separate site collections anyway.

More on Web Log Analysis

In my previous post on web log analysis, I described a PowerShell wrapper script for LogParser.exe, which lets you run SQL-style queries against text logfiles.  Today I have another script that wraps that one and runs as a scheduled task to send the filtered logs to the client each month.

[sourcecode language="powershell"]
#GenerateLogAnalysis will query IIS logfiles and output logs for PDF downloads from the first until
#the last day of the previous month

#function that performs the actual analysis
function RunLogAnalysis(){

$command = "c:\users\daniel.cooper\desktop\scripts\queryLogs.ps1 -inputFolder {0} -outputFile {1} -startDate {2} -endDate {3} -keyword {4}" -f $inputFolder, ($outputPath+$outputFile), (ConvertDateToW3C($startDate)), (ConvertDateToW3C($endDate)), "elibrary"
$command
invoke-expression $command

$emailBody = "<div style=""font-family:Trebuchet MS, Arial, sans-serif;""><img src=""http://www.undp.org/images/cms/global/undp_logo.gif"" border=""0"" align=""right""/><h3 style=""color:#003399;"">Log Analysis</h3>A log anaylsis has been run on the eLibrary for PDF files for "+$monthNames[$startDate.month-1]+" "+$startDate.Year+"<br/>Please find it attached."

sendEmail "recipient@example.org" "sender@example.org" "eLibrary Log Analysis: $outputFile" ($outputPath+$outputFile) $emailBody
}

function ConvertDateToW3C($dateToBeConverted){

return "{0:D4}-{1:D2}-{2:d2}" -f $dateToBeConverted.year, $dateToBeConverted.month, $dateToBeConverted.day;

}

function sendEmail($toAddress, $fromAddress, $subject, $attachmentPath, $body){

$SMTPServer = "yourMailServer"

$mailmessage = New-Object system.net.mail.mailmessage
$mailmessage.from = ($fromAddress)
$mailmessage.To.add($toAddress)
$mailmessage.Subject = $subject
$mailmessage.Body = $body

$attachment = New-Object System.Net.Mail.Attachment($attachmentPath, 'text/plain')
$mailmessage.Attachments.Add($attachment)

$mailmessage.IsBodyHTML = $true
$SMTPClient = New-Object Net.Mail.SmtpClient($SmtpServer, 25)
$SMTPClient.Send($mailmessage)
$attachment.dispose()
}

#Current Month
$currentDate = Get-Date
$localDateFormats = New-Object System.Globalization.DateTimeFormatInfo
$monthNames = $localDateFormats.MonthNames
#Generate first day of last month as a date
$startDate = $currentDate.AddMonths(-1).addDays(-$currentDate.AddMonths(-1).day+1)

#Generate last day of last month as a date
$endDate = $currentDate.AddDays(-$currentDate.day)

#Set the initial parameters
$inputFolder = "c:\temp\www.snap"
$logName = "SNAP"
$outputFile = "LogAnalysis_"+$logName+"_"+$startDate.year+$monthNames[$startDate.month-1]+".csv"
$outputPath = "C:\Users\daniel.cooper\Desktop\"

#RunLogAnalysis reads the variables set above rather than taking parameters
RunLogAnalysis
[/sourcecode]

What’s happening here is that RunLogAnalysis() is the main controller function.  What it does is set up the command to run the queryLogs.ps1 script mentioned in the previous post, wait until it has run and then email the result off.  We have another function, ConvertDateToW3C, which takes a date and converts it to W3C format (yyyy-MM-dd), which is what LogParser.exe likes.  sendEmail() is pretty straightforward; it’s a generic email-sending function.

After the functions we have a little code to set up parameters.  My task was to email the client the previous month’s logs for PDF downloads on the first of each month.  To do this we get last month’s name (for the output filename), the date of the first day of last month and the date of the last day of last month.

After parameter generation is done, we perform the log analysis and email the result.  The script is set up as a scheduled task on the webserver and we’re done.
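
For completeness, this is roughly how such a task could be registered; the task name, script path and start time below are placeholders, not the actual values used:

[sourcecode language="powershell"]
# Hypothetical registration of the monthly task (run on the webserver from an elevated prompt)
schtasks /create /tn "eLibrary Log Analysis" /sc MONTHLY /d 1 /st 06:00 /tr "powershell.exe -File C:\Scripts\GenerateLogAnalysis.ps1"
[/sourcecode]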

Migrating to SharePoint 2010

Upgrades can be a titanic pain, and with a platform that has as many moving parts as SharePoint you’re in for a lot of headaches.

If you’re upgrading Office or even Windows, it’s usually just a matter of sticking a DVD in your machine, hitting OK a few times and going for a coffee.

The first problem with upgrading to SharePoint 2010 is its requirements: it has to run on 64-bit Windows Server 2008, so you may find yourself upgrading the OS before you even start.

I’ve got a good idea!

Because of this, the upgrade task seems like a good opportunity to upgrade your hardware as well.  For example, we moved our two farms onto a single, virtualised farm.

The trouble starts at the planning stage.  If you’re moving from an old farm to a new one, you’re not upgrading, you’re migrating, and pretty much all the support out there is for upgrades.

Two Paths to Follow

SharePoint 2010 gives you two options for upgrading: a database attach upgrade or an in-place upgrade.  We’re doing a database attach because upgrading production servers (which are a mess) into the unknown sounds like a lot of weekends spent in the office.

With a database attach upgrade you back up a content database from your 2007 farm, restore it to your new database server, create a receiving web application on your target farm and then go into PowerShell to mount the database into the web application you just made.  This grinds away for a while, after which you have your old website on your new server.  Great!  It can even look the same, as you can bring the 2007 masterpages along, or you can elect to upgrade the ‘user experience’ during the database attach.
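
In rough outline the PowerShell side looks something like this; the web application URL, database and server names are placeholders, and you'd normally run the test cmdlet first to see what the mount is going to complain about:

[sourcecode language="powershell"]
# Sketch of the database attach step, with placeholder names
# Check the restored 2007 content database for missing features/solutions first
Test-SPContentDatabase -Name "WSS_Content_Intranet" -WebApplication "http://newfarm"

# Mount (and upgrade) it into the receiving web application
Mount-SPContentDatabase -Name "WSS_Content_Intranet" -DatabaseServer "SQL01" -WebApplication "http://newfarm"
[/sourcecode]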

And then the Trouble Began

So this is all good (provided you don’t have any custom code, solutions or files on the old farm that aren’t on the new farm), as long as you’re happy with the new farm where it is, as it is.

We have two 2007 farms, one for intranet and one for extranet.  We’re merging them and redesigning the main site, along with lots of other changes, so we’re moving sites and lists out of the upgraded web application into a new application/site collection.

Now, within a single site collection you can use the Content and Structure tool to move sites, lists and items about.  But if you want to move something between site collections, let alone web applications, it’s a bit trickier.

Powershell to the Rescue

If you can’t do PowerShell, you can’t manage SharePoint 2010; it just can’t be done.  Now, there should be a command to move a site or list, right?  Something like Move-SPWeb or some such?

I’m afraid not.  You can get some fancy-pants software to do that, but it costs an arm and a leg, one testicle and a handful of teeth.  Particularly if you have a lot of material to move or lots of servers in your farm.  Plus you have to install proprietary APIs.

So you have to use PowerShell; specifically, the Export-SPWeb and Import-SPWeb cmdlets.
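
At its simplest the round trip looks something like this; the site URLs and the export file path are placeholders:

[sourcecode language="powershell"]
# Export a site from the old web application to a .cmp file, keeping version history
Export-SPWeb -Identity "http://oldapp/sites/teamA/projects" -Path "C:\temp\projects.cmp" -IncludeVersions All

# Import it under the new site collection (the target site must already exist)
Import-SPWeb -Identity "http://newapp/teamA/projects" -Path "C:\temp\projects.cmp" -UpdateVersions Overwrite
[/sourcecode]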

The Import/Export Obstacle Course

Here’s the first problem: you can’t just import into an empty URL.  You have to create a site on your target site collection, then perform the import.  OK, that’s not a problem, I’ll just create a blank site and import into that.

Experienced SharePoint administrators will immediately see a problem with that.  The site you create to import your material into has to use the same site template as the one you’re exporting from.  As a bonus, you can’t easily tell beforehand what template a site was made with (can anyone correct me on this?).  Luckily the import will tell you what template it needs in the error message.  God help you if your original site was built from some crazy template that doesn’t exist on your new farm.
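
For what it’s worth, the 2007 object model does expose a web’s template, so a check like the sketch below may save a round trip through the error message.  The URLs and the STS#0 template are just examples, and the first half needs the Microsoft.SharePoint assembly loaded, as in the earlier script:

[sourcecode language="powershell"]
# On the source (2007) farm: inspect the template of the web you're about to export
$site = New-Object Microsoft.SharePoint.SPSite("http://oldapp/sites/teamA")
$web = $site.OpenWeb("projects")
"{0}#{1}" -f $web.WebTemplate, $web.Configuration   # e.g. STS#0
$web.Dispose(); $site.Dispose()

# On the target (2010) farm: create the receiving site with the matching template before importing
New-SPWeb -Url "http://newapp/teamA/projects" -Template "STS#0"
[/sourcecode]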

The other irritant is that you end up with a bunch of blank and duplicate lists.  I hate clutter in SharePoint, so it’s a fair bit of work to clean this up.

If my only Tool was a Hammer

Being a lazy soul, if I find myself doing something more than once I’ll look for a way to automate it.  It’s lucky I work with computers, hey?  Since I’m punching all these commands into the PowerShell console, I may as well just save them to a file.  That’s what I’ve done and it seems to work OK now.  I shall post the completed script in my next post.


SharePoint 2010 Scripted Install

I’m really getting into the scripted SharePoint 2010 install hosted at CodePlex.  It’s great because whenever something goes wrong with the config and install process, I can roll the machine back and start clean, instead of carrying every bug forward.  This is very important to me, as the 2007 instances we support at the moment are what we developers call “a bloody mess”.

The project started on SharePoint 2003, got upgraded to 2007, and a bunch of custom web parts were built to duplicate the existing functionality.  It wouldn’t do to have an application that didn’t cost more, wasn’t less capable and didn’t have more bugs than the off-the-shelf product.

So this 2010 project is a fresh start and I wanted to make sure that this instance was rock-solid and not as spotty as some of my earlier attempts.

The trouble is that there’s more than one path to walk when standing up a new instance.  At its most basic, SharePoint can be installed on your workstation with the hobbled embedded edition of SQL Server.  At its most complex you have a least-privileges install, which is best practice and nice and secure.  Actually, if you’re installing SharePoint to host lots of isolated customers (tenants) it’s more complex still, but we don’t have to worry about that.

Once your binaries are installed, you’re given the option of running the config wizard.  The wizard is fine for your dev machine but will really mess up a production environment.  The trouble is that the wizard starts a bunch of services and the like that you can’t get at via Central Admin, so to build a proper instance you need to get into PowerShell.
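
As a rough illustration of what that means, building the farm by hand starts with something like this; the database, server and account names are placeholders of my own, not what the scripted install actually uses:

[sourcecode language="powershell"]
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Create the farm configuration database yourself so you control the names
New-SPConfigurationDatabase -DatabaseName "SP2010_Config" -DatabaseServer "SQL01" `
    -AdministrationContentDatabaseName "SP2010_Admin_Content" `
    -Passphrase (ConvertTo-SecureString "pass@word1" -AsPlainText -Force) `
    -FarmCredentials (Get-Credential "DOMAIN\sp_farm")

# Then provision Central Admin on a port of your choosing
New-SPCentralAdministration -Port 2010 -WindowsAuthProvider NTLM
[/sourcecode]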

Since we’re in PowerShell, and we’re probably going to fluff the first few goes at installing the instance, we may as well script the steps.  Luckily, the folks over at AutoSPInstaller have already done all the hard work.

What the script will do is create a fairly typical SharePoint 2010 instance and, most importantly for me, configure and start the thorny User Profile Synchronisation (UPS) service.

One of the best things is that you can run the script many times without it breaking your existing instance.  Got an error when installing a service?  Fix it and run the script again.
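
A quick way to see whether the thorny UPS service instances actually came up after a run is a read-only query like this (nothing here is specific to AutoSPInstaller; it just needs the SharePoint snap-in loaded):

[sourcecode language="powershell"]
# List the two User Profile service instances and their provisioning status
Get-SPServiceInstance | Where-Object { $_.TypeName -like "User Profile*" } |
    Select-Object TypeName, Status
[/sourcecode]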

I’ve made some extensions to help automate the config: the stock script will get you to the point where everything exists, but you still need to configure it to make it work in your environment.  What I’d like, ideally, is a script that would do the install and all the config, so that I have a known, properly configured instance that I can reproduce exactly and quickly.

I’ll write a bit more and give more detail in a future post.