July 2009 - Posts

Anyone else out there notice that if you add a generic handler to your project, it won't compile?  It appears to be an issue with the template included in Visual Studio; it may get fixed in a service pack, but as far as I know it hasn't been yet.  The file type I am describing is the .ashx generic handler and not the ASP.NET Handler, as you can see below.

[Screenshot: GenericHandler]

After it creates the file, you might immediately notice there are issues.  If you try to compile it, you'll get errors like the ones shown below.

[Screenshot: GenericHandlerError]

Luckily, resolving the error is quite easy.  You just need to add a using directive for System.Web.Services.  Once you do that, you can compile successfully and begin building your handler.  I just thought I would point this out since I have seen this on multiple systems now.
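Here is a rough sketch of what the corrected code-behind might look like (the class and namespace names are just placeholders); the key addition is the using directive for System.Web.Services, which the attributes the template generates depend on.

using System.Web;
using System.Web.Services;   // the missing directive that fixes the compile errors

namespace MyWebApplication   // placeholder namespace
{
    // The template decorates the class with WebService attributes,
    // which is why System.Web.Services is required.
    [WebService(Namespace = "http://tempuri.org/")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    public class MyHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "text/plain";
            context.Response.Write("Hello World");
        }

        public bool IsReusable
        {
            get { return false; }
        }
    }
}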

I had a few questions about this recently, so I thought I would write a quick post on it.  It's actually pretty simple to set up, but I know people like to see instructions with pictures.  It's really not much different than setting up any other content source, such as a file share.  There are a few obvious things you have to take care of though.  First, you need to make sure that the index server has access to the site it's crawling.  That means if you are behind a firewall, or you need to access your public facing web site using a different URL, you need to take that into consideration.  We'll talk about that more in a minute.

To crawl your web site, you first go to your content sources page to create a new content source.  Give your content source a name, select web site, and type in the URL of the web site that you want to crawl.  When you choose this option, the crawler acts as a simple spider, following each link it can find and adding it to the index.  In this example, I am going to crawl DotNetMafia.com.

[Screenshot: EnterpriseSearchWebSiteContentSource1]

You also have the capability to set how many links the spider will follow when crawling and whether or not server hops are allowed.

[Screenshot: EnterpriseSearchWebSiteContentSource2]

After you configure your content source, you can start a crawl.  When it completes, view the crawl log to see if there were any issues crawling your site.  This can help you find broken links as well.
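If you would rather script this than click through the UI, here is a rough sketch using the search administration object model; the content source name, URL, and depth values are just placeholders, so treat it as a starting point rather than a definitive recipe.

using System;
using Microsoft.Office.Server.Search.Administration;
using Microsoft.SharePoint;

class CreateWebContentSource
{
    static void Main()
    {
        // Placeholder URL for a site served by the SSP you want to administer.
        using (SPSite site = new SPSite("http://moss-server"))
        {
            SearchContext searchContext = SearchContext.GetContext(site);
            Content content = new Content(searchContext);

            // Create a web site content source and point it at the site to spider.
            WebContentSource webSource = (WebContentSource)content.ContentSources.Create(
                typeof(WebContentSource), "DotNetMafia");
            webSource.StartAddresses.Add(new Uri("http://www.dotnetmafia.com"));

            // Roughly equivalent to the page depth and server hop settings in the UI.
            webSource.MaxPageEnumerationDepth = 3;
            webSource.MaxSiteEnumerationDepth = 0;   // stay on the start server

            webSource.Update();
            webSource.StartFullCrawl();
        }
    }
}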

If you want to crawl a site that requires authentication, that can be done as well by creating a crawl rule.  A crawl rule lets you specify credentials from a variety of sources, such as a certificate, a cookie, or even forms-based authentication (FBA).  I don't have an example handy today though, so I'll cover it in a future post.

As I mentioned earlier, if you have to specify a different name for a server internally than externally, that can be handled with a server name mapping.  A server name mapping allows you to take a URL that was crawled and replace it with a different URL (e.g., the external URL of the site).  Here is what that would look like.

[Screenshot: EnterpriseSearchServerNameMapping]

The last thing I will point out is that there isn't a way to exclude a portion of the page from being included in the index (at least as far as I know).  What this means is that if you have common navigation on every page, the words in it will show up as hits on every page.  For example, if you have a link called Contact Us, every page with the Contact Us link is going to show up in the search results for that term.  Here's an example of what I mean.  There are way too many results, which doesn't help the user at all in this case.

[Screenshot: EnterpriseSearchWebSiteResults1]

As you can see, crawling web sites with Enterprise Search is pretty easy to set up.  You may have to deal with some issues like the one above, but it’s still not a bad solution.  This is a great way to index your public facing corporate site and bring those results into SharePoint.

I had the opportunity to present a couple of talks on Enterprise Search this weekend at SharePoint Saturday Ozarks.  Kyle Kelin and I drove out there on Friday and attended a great speakers' dinner.  Mark Rackley did a great job running a smooth event and getting a great group of speakers together.  I also got to meet in person, for the first time, a lot of people I had only seen on twitter.  I gave two talks at this event, but I also had time to attend the talks by Mike Watson, John Ferringer / Sean McDonough, and Eric Schupps.

My first talk was an introduction to Enterprise Search.  The goal of the talk was to show you the basics of setting up content sources and crawls.  I wanted to show that it was pretty easy to get started setting up crawls for SharePoint sites, web sites, file shares, databases, web services, and people.

My second talk continued on Enterprise Search, but showed you some of the different ways you can query search results.  We covered keyword, Full Text SQL, and URL syntaxes.  I then demonstrated how to use the API and web services to query Enterprise Search.
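To give a flavor of the API portion of the talk, here is a minimal sketch of querying Enterprise Search with the KeywordQuery class; the site URL and query text are just placeholders.

using System;
using System.Data;
using Microsoft.Office.Server.Search.Query;
using Microsoft.SharePoint;

class KeywordQuerySample
{
    static void Main()
    {
        // Placeholder URL for a site associated with your SSP.
        using (SPSite site = new SPSite("http://moss-server"))
        {
            KeywordQuery query = new KeywordQuery(site);
            query.QueryText = "sharepoint";                  // keyword syntax
            query.ResultTypes = ResultType.RelevantResults;

            ResultTableCollection results = query.Execute();
            ResultTable relevantResults = results[ResultType.RelevantResults];

            // ResultTable implements IDataReader, so it loads straight into a DataTable.
            DataTable table = new DataTable();
            table.Load(relevantResults, LoadOption.OverwriteChanges);

            foreach (DataRow row in table.Rows)
            {
                Console.WriteLine("{0} - {1}", row["Title"], row["Path"]);
            }
        }
    }
}

The web services route works the same way conceptually: you pass a query in keyword or Full Text SQL syntax to search.asmx and get a result set back.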

Slides and code samples for both talks are included in the attachment at the bottom of this post.  If you have any questions, feel free to contact me.  Thanks again to everyone who attended and made the event possible.  It was a great event.

For more information on search, check out my Enterprise Search Roundup with links to previous posts I have written on the topic.

Follow me on twitter.

I'm not privy to the NDA, so I got to look at some of the new SharePoint 2010 (#sp2010) information for the first time with the Sneak Peek videos on the Microsoft site.  I'll try not to just repeat information from the videos, but tell you what I am looking forward to and make comments.  I am sure everything they have stated so far is subject to change, but if even half of it gets implemented, we'll be in good shape.  As I was watching the admin video, the first thing I noted is that there will be a logging database.  This appears to replace the need to dig for errors in the 12 hive's LOGS folder.  This is very exciting and should make it much easier to track down problems.  Also of interest is that there is much improved export support, and you will be able to back up specific sites and lists.  This one is a no-brainer and should have been included to begin with.

Another thing I saw was a number of features to help support large lists.  The first is an admin-configurable threshold on how many items can be displayed in the default view at a time.  The default was 5000 in the demo, so I am wondering if this will be the new suggested limit as opposed to the existing 3000 item limit today.  What is cool is that the interface will display all of the items for an administrator, but it will notify a regular user that too many items have been returned and that they need to use a filter.

Visual Studio 2010 looks like it will be a great experience for developing SharePoint solutions.  Out of the box there is built-in support for editing all types of things in SharePoint, including importing workflows created with SharePoint Designer and existing .wsp packages.  Building web parts no longer requires generating HTML via code thanks to the new Visual Web Part designer.  It appears they created a new type called VisualWebPartUserControl, which inherits from UserControl.  You can drag and drop controls right onto the design surface and then deploy the web part with minimal effort.  There are lots of designers and tools for working with features and the solution package itself, but it appears it takes care of most everything for you while still allowing you to customize things when you need to.

The changes to the Business Data Catalog really excite me.  The BDC is now known as Business Connectivity Services (BCS).  Application definition files can be created easily with Visual Studio 2010 or SPD.  A visual design surface is available and allows you to easily create an application definition for an entity that comes from a database, web service, or .NET object.  It also now has true insert/update/delete support and will create the methods in your application definition so that you can use that functionality later in a list.  The new External List feature allows you to associate this application definition with a list and perform all of the CRUD operations on it just like it was a regular SharePoint list.  This will make it very easy to integrate external data into SharePoint.  The only question that comes to mind is whether there is a way to customize the edit form, or whether that would result in an unsupported scenario.

Another cool thing about the BCS is that it integrates with Office.  Document templates can be created, and data can be retrieved directly from the BCS to fill in values in a document.  Microsoft Groove has been renamed SharePoint Workspace and provides a graphical interface for working with SharePoint.  On top of that, it provides the capability of syncing entire sites offline, including BCS data.  Documents can be updated and LOB data can be changed, and then it can all be synced back to SharePoint.  I think this will provide great functionality for any type of field or remote user who is only occasionally connected.

As a developer, another feature I was really excited about was LINQ support for SharePoint lists.  You will be able to point the SPMetal tool at a SharePoint list to generate a strongly typed data context class.  You can then query the list as you would anything else using LINQ.  I took a look at the CTP of the Developer Documentation today as well.  It looks like all of the collections now have a method called GetTypeEnumerator<T>().  This returns an IEnumerator<T>, which means any collection that implements this can also be queried with LINQ.  I am really hoping this eliminates CAML queries; however, I did see in the Client Object Model demo that they still used a CAML query there.  Since I brought up the Client OM, I'll mention that there is now a client object model available for SharePoint, which will make things like integrating with Silverlight very easy.
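Just to illustrate the idea, here is a rough sketch of what querying a list through a generated data context might look like; the context class name and the Announcements list are hypothetical stand-ins for whatever SPMetal actually generates from a real site, so don't take the exact shape as gospel.

using System;
using System.Linq;

class LinqToSharePointSketch
{
    static void Main()
    {
        // "TeamSiteDataContext" and its "Announcements" property are hypothetical
        // names standing in for what SPMetal would generate from your site.
        using (TeamSiteDataContext context = new TeamSiteDataContext("http://server/site"))
        {
            var recent = from announcement in context.Announcements
                         where announcement.Title.StartsWith("SharePoint")
                         select announcement;

            foreach (var item in recent)
            {
                Console.WriteLine(item.Title);
            }
        }
    }
}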

Unfortunately, they haven't really released any information about the changes in Search yet.  Those are the ones I am looking forward to hearing about the most.  Hopefully, they completely scrapped the Search Center and started over.  I am wondering if there are any changes in the Records Center too.  I also wonder how this will affect partners that have built tools around MOSS 2007.  I can see Lightning Tools' BDC Meta Man and AvePoint being directly affected.  However, I am sure there will still be new areas for these companies to explore with the new product.  This is an exciting time, and I can't wait for the public beta and to hear more about the product.  Unfortunately, it's not looking like I will make it to the SharePoint Conference this year, so I will have to get a lot of information second hand.

On a related note, don’t forget about the Tulsa SharePoint Interest Group tonight.

Follow me on twitter.

I've seen quite a few posts in the forums from people wanting not to search the contents of documents.  These people, for whatever reason, were interested in searching just the title and path.  Out of the box this of course is not possible, but I got to thinking about how this functionality could be added quite easily to Wildcard Search.  With Release 5, I added a new property called DisableFullTextSearch.  Selecting this box will cause the web part to only search the Title and Url fields.  It seems to work great.  I also fixed a few minor issues with default values in the web part in a few places.  If you don't need this functionality, you can stay on your existing version.
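For those curious about what a title-only search boils down to, here is a rough illustration using a Full Text SQL query restricted to the Title property.  This isn't the web part's actual code, and the site URL and search term are placeholders; the real web part also considers the Url field.

using System;
using System.Data;
using Microsoft.Office.Server.Search.Query;
using Microsoft.SharePoint;

class TitleOnlySearchSketch
{
    static void Main()
    {
        // Placeholder URL for a site associated with your SSP.
        using (SPSite site = new SPSite("http://moss-server"))
        {
            FullTextSqlQuery query = new FullTextSqlQuery(site);

            // Wildcard match against Title only; document contents are never touched.
            query.QueryText =
                "SELECT Title, Path FROM SCOPE() WHERE CONTAINS(Title, '\"mafia*\"')";
            query.ResultTypes = ResultType.RelevantResults;

            ResultTable results = query.Execute()[ResultType.RelevantResults];

            DataTable table = new DataTable();
            table.Load(results, LoadOption.OverwriteChanges);
            Console.WriteLine("{0} result(s)", table.Rows.Count);
        }
    }
}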

I also took a look at the download numbers for all of the releases.  Here is where it stands as of today.

Version Downloads
Release 4 439
Release 3 1586
Release 2 397
Release 1 664

Not huge numbers, but I would say that is pretty decent for a SharePoint web part.  Of course I have no way of knowing how many people are truly using it, but I would like to think that this web part is really helping people out.  Although it doesn't have all of the functionality that a commercial product does, it gets the job done and doesn't cost the community anything to use.  This functionality should have been included to begin with, which is why I am happy to give it to the community for their use.  Hope this release is useful to you, and thanks for the continuing feedback.

Wildcard Search Release 5

I am pleased to announce I am speaking at SharePoint Saturday Ozarks in Harrison, AR on 7/18.  Mark Rackley is the man behind the event, and he has lined up some great speakers, some of whom I have seen before and some I am looking forward to meeting.  I have two talks covering Enterprise Search at this event.  The first one, Introduction to Enterprise Search, will show new users of SharePoint what you can do out of the box with Search.  The second talk, Interacting with Enterprise Search using Code, will show you how to use the API and Web Services.  It should be a great event, and it looks like there will be a #SharePint on Saturday night and maybe something less formal on Friday night if I have anything to do with it.  This will only be my second trip to Arkansas, so I am looking forward to the event and meeting some new people.

Follow me on twitter.

I am sticking with my series of introductory Enterprise Search topics today by writing up some details on how to index a file share.  Setting up a file share index is pretty simple, but there are a few things to know, so that is the point of today’s post. 

The first step in indexing a file share is identifying your crawl account.  This is the account that will be used to index the file share (unless specified differently with a crawl rule) and therefore will need read access to the file share.  Start by granting this account read access to any folder, subfolder, and file that you want indexed.  Any folder this account doesn't have access to will be excluded.  If you are not too familiar with how permissions work on file shares, there are two places an account must have permission: the Sharing tab and the Security tab.  You use the Security tab to grant an account access on the file system itself.  This is the same permission that applies if the user is logged into that machine directly and trying to view the files.  The Sharing tab controls what permissions the user has when accessing that folder over the network.  In order for an account to be able to read files over the network, the user must have read permission on both tabs.  Here is an example of what mine looks like for my crawl account MOSS_Setup.  Note: this screenshot is from Windows Server 2008; previous versions looked a bit different.

[Screenshot: EnterpriseSearchSharingTab]

Security Tab with read access:

[Screenshot: EnterpriseSearchSecurityTab]

After you have configured permissions for your account, go to the SSP –> Search Administration –> Content Sources.  Create a new content source and give it a name; I called mine File Share in this case.  Then you need to specify a start address.  You can specify the path as file://server/share or \\server\share.  Enter the path to one or more file shares and then save the content source.  You can also specify whether or not to index subfolders here.  This is what my file share content source looks like.

[Screenshot: EnterpriseSearchFileShareContentSource]

One thing to note before crawling is that it will only index file types that you have allowed on the File Types page.  For example, PDF is not included by default.  If you need to add any file types, specify the extension without the period (i.e., pdf not .pdf).  You can also add file types programmatically, as shown below.  Adding the extension alone is enough to get the files indexed, but if you want the contents of each file indexed, you will also need to install an appropriate IFilter for the new file type.
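Here is a rough sketch of that using the search administration object model; the extension and site URL are placeholders, and this would need to run on a server in the farm.

using Microsoft.Office.Server.Search.Administration;
using Microsoft.SharePoint;

class AddFileTypeSketch
{
    static void Main()
    {
        // Placeholder URL for a site associated with your SSP.
        using (SPSite site = new SPSite("http://moss-server"))
        {
            SearchContext searchContext = SearchContext.GetContext(site);
            Content content = new Content(searchContext);

            // Register the extension with no leading period, just like the UI expects.
            content.ExtensionList.Create("pdf");

            // Remember: this only tells the crawler to pick up .pdf files.
            // A PDF IFilter is still required to index their contents.
        }
    }
}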

Once your file types are in order, you are ready to begin a full crawl.  After the crawl is completed, view the Crawl Log and verify that your files were indexed.  If there was a permissions problem or any other issues accessing the file share, you will see it here.  At this point you can go to your search center and try a search.  If all goes well, you should see some search results.  To see what got indexed, you can easily write a keyword query to show everything in the content source.  For example:

ContentSource:"File Share"

The results would look something like this.

[Screenshot: EnterpriseSearchFileShareResults]

As you can see, it's pretty simple to index file shares.  For more information on querying by content source, check out this post.

Follow me on twitter.