25 September 2009

Starting to program vtiger CRM

This article applies to vtiger 5.1.0 - some of the advice/code may not be accurate for later versions.


We've been investigating the vtiger CRM as a component in an open source set of tools that a business or organisation might need. A crucial idea is to link various tools together, so I have been looking at how to get information into and out of vtiger.

As a start, we're looking at an Outlook add-in to add an email address/name to the vtiger Contacts list, and how to export to an accounts package. There is already an Outlook vtiger add-in, but it seems pretty flaky and only does synchronisation; we want it to check a new email for existing matches and only add it if need be.

About vtiger

vtiger is a Customer Relationship Manager (CRM). vtiger deals with Leads who might turn into Contacts, with a Contact associated with an Account. You can set up Products with stock levels (and Services as well) and then generate Quotes and Invoices which can be downloaded as PDFs or sent by email. All that sort of thing. It is important to know that vtiger is designed for use only by a business' staff - it is not intended to be used by customers, although vtiger can check incoming support emails.

vtiger is extensible and there is some basic documentation to get you started making your own modules. The instructions effectively describe how to create a new object type (eg a time sheet), represented as a table or two in the vtiger database. You can link your module and its new object type into vtiger, so that eg menu [Tools][Time sheet] shows a list of time sheets in the standard vtiger style, clicking on a record shows the detail, with an Edit option available.

For the rest of this piece, I'm looking at adding functionality to an existing module. This will hopefully help you even if you are starting a complete module. Be careful if you alter existing code - it may well be overwritten if you upgrade vtiger.

The vtiger database

You will almost certainly want to have access to the database that vtiger uses, partly so you can see what's in there, but also so you see any changes as they really are. For MySQL, using phpMyAdmin will probably be the available tool - make sure you can get in to your database.

If there aren't any already, make some test Contacts within vtiger. In phpMyAdmin have a look at these tables:

  • vtiger_contactdetails
  • vtiger_contactscf
  • vtiger_contactaddress
  • vtiger_contactsubdetails

Each of these tables has a column named 'contactid' or similar as a primary or foreign key. Note that the "contact_no" column is what is displayed in vtiger as the "Contact Id", eg 'CON1'.

The table vtiger_modentity_num is used when generating the next "contact_no" value for the appropriate module, so 2 might be the next free Contact number, making 'CON2'. Later, we'll use the setModuleSeqNumber() function of the vtiger CRMEntity class to get this value.

The 'contactid' numbers identifying the contact don't start at 0 or 1. In fact, vtiger assigns a unique id number across all objects, so object 792 might be a Contact and 793 might be an Invoice. These id numbers are stored in the vtiger_crmentity table. A deleted flag in there is set when an item is first deleted, so that it can be recovered from the recycle bin. Later, we'll use the insertIntoCrmEntity() function to get a new object id.

vtiger command structure

All vtiger accesses come through one root file index.php. The parameters determine which module operation is carried out. For example, consider this URL

/index.php?action=DetailView&module=Invoice&record=800&parenttab=Inventory

It runs the code in DetailView.php in the directory /modules/Invoice/. That code finds the "record" value and shows the relevant invoice. So, you can add to the "Detail View" functionality by adding code to this file.

You can create your own actions by making a new file in the relevant directory. You then need to provide a means of invoking that action - more later.
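The mapping from URL parameters to a script file can be sketched in a few lines. This is only an illustration of the rule described above, written in Python rather than PHP; the function name script_for_url is made up for the example, and vtiger's real dispatcher does a good deal more (security checks, for one).

```python
from urllib.parse import urlparse, parse_qs

def script_for_url(url: str) -> str:
    """Return the module script that the URL's 'module' and 'action' point at."""
    params = parse_qs(urlparse(url).query)
    module = params["module"][0]   # eg Invoice
    action = params["action"][0]   # eg DetailView
    # Hypothetical simplification: the script is /modules/<module>/<action>.php
    return f"/modules/{module}/{action}.php"

print(script_for_url(
    "/index.php?action=DetailView&module=Invoice&record=800&parenttab=Inventory"
))
```

The other parameters, such as "record", are left for the script itself to read.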

Smarty templates

vtiger uses the Smarty template system to generate its output. By and large, the vtiger code doesn't generate any HTML - instead, it loads up a template from TPL files in the /Smarty/templates/ directory. vtiger PHP code sets various values that are combined into the template to form the final output. Smarty has an extensive control language available for use in a template, so it can cope with complicated structures, eg it can iterate through arrays to generate multiple table rows. One Smarty template can invoke another, so finding where something is generated can be tricky.

Adding a tool option to an Invoice

If you view an Invoice there are Tools listed on the right hand side: "Export to PDF" and "Send Email with PDF". To add another option here, you have to drill down through the tpl files: DetailView.php invokes template Inventory/InventoryDetailView.tpl which includes template Inventory/InventoryActions.tpl.

In InventoryActions.tpl, after this line:

<!-- To display the Export To PDF link for PO, SO, Quotes and Invoice - ends -->

add in this code:

{if $MODULE eq 'Invoice'}
{assign var=send_accounts_action value="SendAccounts"}
<tr><td align="left" style="padding-left:10px;">
<a href="index.php?module={$MODULE}&action={$send_accounts_action}&return_module={$MODULE}&return_action=DetailView&record={$ID}&return_id={$ID}" class="webMnu"><img src="{'actionGenerateInvoice.gif'@vtiger_imageurl:$THEME}" hspace="5" align="absmiddle" border="0"/></a>
<a href="index.php?module={$MODULE}&action={$send_accounts_action}&return_module={$MODULE}&return_action=DetailView&record={$ID}&return_id={$ID}" class="webMnu">Send to Accounts</a>
</td>
</tr>
{/if}

Upload InventoryActions.tpl and view an Invoice - check that "Send to Accounts" is visible on the right.

Adding to accounts

Now, create a file SendAccounts.php - probably copying an existing file such as DetailView.php makes sense.

OK - this bit is skimpy for now, but it might get you started. Near the end, add in the following code to query the vtiger database and display a little info from each record found. To get the list of products ordered you will also have to query the vtiger_inventoryproductrel table.

// Look up the invoice that is currently being viewed
$query = "select * from vtiger_invoice where invoiceid=?";
$invoice_id = $focus->id;
$result = $adb->pquery($query, array($invoice_id));
$noofrows = $adb->num_rows($result);
$final_output = "";
// Build one line of output for each row found
while ($inv_info = $adb->fetch_array($result))
{
    $invoice_no = $inv_info['invoice_no'];
    $subject = $inv_info['subject'];
    $invoicedate = $inv_info['invoicedate'];
    $accountid = $inv_info['accountid'];
    $total = $inv_info['total'];
    $final_output .= "$invoice_no: $subject $invoicedate $accountid $total<br/>";
}
// Hand the string to the template and render it
$smarty->assign("SEND_DETAILS", $final_output);
$smarty->display("Inventory/SendToAccounts.tpl");

At the end of our new code, the Smarty variable "SEND_DETAILS" is set to the string we want to display. Then the Smarty template SendToAccounts.tpl is run.

You had best create the template file SendToAccounts.tpl initially as a copy of an existing template, eg InventoryDetailView.tpl. You will need to pare it down - and eventually add in a line like the following to display your output:

<p><b>Storing data</b>: {$SEND_DETAILS}</p>

OK - all we've done so far is display the invoice details. Actually talking to an accounts system is beyond the scope of this article.

Adding a new contact

As I said earlier, I want to have an Outlook add-in that can be used to add a contact to the vtiger database, after checking to see whether the contact already exists. As the first step to getting this done, I set up a small web form that has First name, Last name and Email fields. The Go button is set up to go to this vtiger URL eg:

/index.php?action=AddOutlookContact&module=Contacts&FirstName=Chris&LastName=Cant&Email=sales@phdcc.com

As per usual, vtiger uses its standard rules so that this script is called:
/modules/Contacts/AddOutlookContact.php

I made AddOutlookContact.php in a similar way to that described above, using the title "Import from Outlook" for the main content tab. The code generates the output HTML in the PHP script - a neater final approach would put the raw HTML within the template. Anyway, the PHP code generates its HTML in $Output and then invokes my new template AddOutlookContact.tpl:

$smarty->assign("IMPORT_DETAILS", $Output);
$smarty->display("AddOutlookContact.tpl");

In the guts of AddOutlookContact.tpl, the output is reproduced:

<div>{$IMPORT_DETAILS}</div>

The tasks within AddOutlookContact.php are:

  • Get passed parameters
  • Check parameters
  • Find any matching names or emails
  • Show matches to user
  • Show Add Contact button

So: the user can see any existing matches, and can decide whether or not to press the Add Contact button, which goes to another script AddOutlookContact2.php.

The SQL script to query the existing contacts is as follows:

$query = "SELECT c.*, a.accountname, crm.deleted FROM vtiger_contactdetails AS c
LEFT JOIN vtiger_account AS a USING (accountid)
INNER JOIN vtiger_crmentity AS crm ON crm.crmid=c.contactid
WHERE lastname LIKE '%$QuotedLastName%' OR email LIKE '%$QuotedEmail%'";

Note that I also use the function mysql_real_escape_string() to ensure that the passed strings cope with single and double quotes etc.

The SQL links vtiger_contactdetails with the vtiger_crmentity table so the Deleted column can be picked up. The vtiger_account table is also included so any associated account name can be found.
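An alternative to escaping the strings by hand is to pass them as query parameters, as the pquery() calls elsewhere in this article do. The sketch below shows the same three-table query with placeholders. It uses Python's sqlite3 module as a stand-in for MySQL, and the tables are cut-down assumptions with only the columns needed here, not the real vtiger schema.

```python
import sqlite3

def find_matches(conn, last_name, email):
    """Return (contactid, lastname, email, accountname, deleted) for likely matches."""
    sql = (
        "SELECT c.contactid, c.lastname, c.email, a.accountname, crm.deleted "
        "FROM vtiger_contactdetails AS c "
        "LEFT JOIN vtiger_account AS a USING (accountid) "
        "INNER JOIN vtiger_crmentity AS crm ON crm.crmid = c.contactid "
        "WHERE c.lastname LIKE ? OR c.email LIKE ?"
    )
    # The wildcards are added to the parameter values, not to the SQL text.
    return conn.execute(sql, (f"%{last_name}%", f"%{email}%")).fetchall()

# Demo data (made up for the example).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vtiger_crmentity (crmid INTEGER PRIMARY KEY, deleted INTEGER);
    CREATE TABLE vtiger_account (accountid INTEGER PRIMARY KEY, accountname TEXT);
    CREATE TABLE vtiger_contactdetails (contactid INTEGER PRIMARY KEY,
        accountid INTEGER, lastname TEXT, email TEXT);
    INSERT INTO vtiger_crmentity VALUES (792, 0);
    INSERT INTO vtiger_crmentity VALUES (793, 1);
    INSERT INTO vtiger_account VALUES (10, 'Example Ltd');
    INSERT INTO vtiger_contactdetails VALUES (792, 10, 'Cant', 'sales@phdcc.com');
    INSERT INTO vtiger_contactdetails VALUES (793, NULL, 'O''Brien', 'ob@example.org');
""")

# A name containing a single quote needs no special handling.
print(find_matches(conn, "O'Brien", "nobody@nowhere.invalid"))
```

The second contact has no account, so the LEFT JOIN returns it with an empty account name, and its deleted flag shows that it is in the recycle bin.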

The code goes through all the found matches and displays the results to the user. I provided a link to each contact - this is the code for the link:

$MatchLink = "index.php?action=DetailView&module=Contacts&record=$Match_ContactId&parenttab=Support";

The following form shows a suitable "Add new contact" button:

<form action='index.php' method='post' onsubmit='VtigerJS_DialogBox.block();'>
<input type='hidden' value='AddOutlookContact2' name='action' />
<input type='hidden' value='Contacts' name='module' />
<input type='hidden' value='$htmlLastName' name='LastName' />
<input type='hidden' value='$htmlFirstName' name='FirstName' />
<input type='hidden' value='$htmlEmail' name='Email' />
<input type='submit' value='Add new contact' />
</form>
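The values placed in the hidden fields sit inside single-quoted HTML attributes, so they have to be HTML-escaped first (the variable names above suggest that this was done; in PHP, htmlspecialchars() with ENT_QUOTES does the job). A minimal sketch of the idea in Python, where hidden_input is a made-up helper name:

```python
from html import escape

def hidden_input(name: str, value: str) -> str:
    """Build a hidden field whose value cannot break out of the quoted attribute."""
    safe_value = escape(value, quote=True)  # escapes & < > " and '
    return f"<input type='hidden' value='{safe_value}' name='{name}' />"

print(hidden_input("LastName", "O'Brien"))
```

Without the escaping, a last name such as O'Brien would end the attribute early and corrupt the form.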

AddOutlookContact2.php has a similar core to the previous script, except that it generates HTML that shows what it has done. Here's what I did to add a new contact. This is probably not the recommended vtiger method, but it seems to get the job done.

$adb->startTransaction();
$NewContactNo = $focus->setModuleSeqNumber("increment",'Contacts'); // updates vtiger_modentity_num
$focus->insertIntoCrmEntity('Contacts'); // updates vtiger_crmentity and sets $focus->id

$query = "insert into vtiger_contactdetails (contactid,contact_no,firstname,lastname,email) values(?,?,?,?,?)";
$qparams = array($focus->id, $NewContactNo, $FirstName, $LastName, $Email);
$adb->pquery($query, $qparams);

$query = "insert into vtiger_contactscf (contactid) values(?)";
$qparams = array($focus->id);
$adb->pquery($query, $qparams);

$query = "insert into vtiger_contactaddress (contactaddressid) values(?)";
$qparams = array($focus->id);
$adb->pquery($query, $qparams);

$query = "insert into vtiger_contactsubdetails (contactsubscriptionid) values(?)";
$qparams = array($focus->id);
$adb->pquery($query, $qparams);

$adb->completeTransaction();

Here are the steps:

  • Call setModuleSeqNumber() to get the next contact no eg CON2
  • Call insertIntoCrmEntity() to get the next contact id, eg 794
  • Insert the basic details into vtiger_contactdetails
  • Insert the contact id into vtiger_contactscf
  • Insert basic rows into vtiger_contactaddress and vtiger_contactsubdetails

The output provides a suitable link to the new Contact.
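The steps above can be sketched end to end against a throwaway database. This is an illustration only: it is written in Python with sqlite3, and the table and column names (modentity_num, crmentity and so on) are simplified, hypothetical stand-ins for the vtiger tables, not the real schema. The two lookups stand in for what setModuleSeqNumber() and insertIntoCrmEntity() are described as doing.

```python
import sqlite3

def make_demo_db():
    """Build cut-down, hypothetical stand-ins for the vtiger tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE modentity_num (module TEXT PRIMARY KEY, prefix TEXT, cur_id INTEGER);
        CREATE TABLE crmentity (crmid INTEGER PRIMARY KEY, setype TEXT, deleted INTEGER);
        CREATE TABLE contactdetails (contactid INTEGER PRIMARY KEY, contact_no TEXT,
                                     firstname TEXT, lastname TEXT, email TEXT);
        CREATE TABLE contactscf (contactid INTEGER PRIMARY KEY);
        CREATE TABLE contactaddress (contactaddressid INTEGER PRIMARY KEY);
        CREATE TABLE contactsubdetails (contactsubscriptionid INTEGER PRIMARY KEY);
        INSERT INTO modentity_num VALUES ('Contacts', 'CON', 2);
        INSERT INTO crmentity VALUES (792, 'Contacts', 0);
        INSERT INTO crmentity VALUES (793, 'Invoice', 0);
    """)
    return conn

def add_contact(conn, first_name, last_name, email):
    """Allocate both identifiers and insert the contact rows as one unit."""
    with conn:  # commits on success, rolls back if any statement raises
        # Step 1: next display number for the Contacts module, eg CON2.
        prefix, cur_id = conn.execute(
            "SELECT prefix, cur_id FROM modentity_num WHERE module = 'Contacts'"
        ).fetchone()
        conn.execute(
            "UPDATE modentity_num SET cur_id = cur_id + 1 WHERE module = 'Contacts'"
        )
        contact_no = f"{prefix}{cur_id}"
        # Step 2: next object id, shared by every object type.
        crm_id = conn.execute(
            "INSERT INTO crmentity (setype, deleted) VALUES ('Contacts', 0)"
        ).lastrowid
        # Step 3: the basic details.
        conn.execute(
            "INSERT INTO contactdetails (contactid, contact_no, firstname, lastname, email) "
            "VALUES (?, ?, ?, ?, ?)",
            (crm_id, contact_no, first_name, last_name, email),
        )
        # Steps 4 and 5: one row in each satellite table, keyed by the same id.
        for table, key in (
            ("contactscf", "contactid"),
            ("contactaddress", "contactaddressid"),
            ("contactsubdetails", "contactsubscriptionid"),
        ):
            conn.execute(f"INSERT INTO {table} ({key}) VALUES (?)", (crm_id,))
    return crm_id, contact_no

print(add_contact(make_demo_db(), "Chris", "Cant", "sales@phdcc.com"))
```

Doing all of the inserts inside one transaction means that a failure part-way through does not leave a half-created contact or a wasted sequence number.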

Later, I made a real Outlook 2007 add-in to call this vtiger code. An extra button is shown on the ribbon when you read an email. When you click the button, it picks up the sender's name and email address and opens the correct URL, with parameters, in the user's default browser.

26 August 2009

DNN no container span causes validation error

If you turn off the DNN container for a module, then DNN by default inserts an extra SPAN that can cause a W3C validation error. One solution is to have a minimal container.

I had turned off the "Display Container" option for some modules so that no title is displayed. (In Admin Edit mode, the default container is used so that you can get at settings etc.) The HTML produced by DNN looked like this, within the skin pane:
<a name="402"></a>/
<span id="dnn_ctr402_ContentPane" class="DNNAlignleft">
<!-- Start_Module_402 -->
<div id="dnn_ctr402_ModuleContent">

and then the module content.

This causes a validation error because you cannot nest a DIV within a SPAN.

My solution is not to use the "Display Container" option - instead, I created a container with no title and used that instead. In the replacement container, I moved the [ACTIONS] and [ICON] into the footer. The NoTitleContainer container files are available here: http://www.phdcc.com/download/DNN/NoTitleContainer.zip. Unzip in /Portals/_default/Containers/MinimalExtropy. I'm not sure if you need to go to [Admin][Skins] to make the container visible.

25 June 2009

DNN Event Viewer spider errors

If you are getting "Page Load Exception" errors in the DotNetNuke (DNN) [Admin][Event Viewer] with the UserAgent indicating a web crawling spider, then you need this fix:

The fix: Get the Browser Caps - v2 by Oliver Hine. Unzip the file App_Browsers.zip to get the file OtherSpiders.browser - put this in the App_Browsers directory of your DNN site. Note that your DNN site will automatically restart, so perhaps do this at a quiet time.

Alternatively (if you want), you should be able to stop these spiders accessing your site using a suitable robots.txt file.

This patch is needed for DNN 4.9.4 and possibly earlier. I think that it may be fixed in DNN 5.1+.

Spiders fixed: This file has fixes for these spiders: Baiduspider, Yandex, ia_archiver, Sphere, Feedfetcher-Google, Yanga, worio, zibber, twiceler, voilabot, aisearchbot, robotgenius and R6_FeedFetcher.

Spiders not fixed: However it doesn't fix TweetmemeBot - I failed to add support for this. Read this Tweetmeme.

Update: doesn't seem to work for Twiceler or voilabot.

What's happening: DNN code is asking ASP.NET for details of the browser capabilities. ASP.NET gets this from the UserAgent string. DNN asks for the major and minor version numbers of the browser and causes an exception if these cannot be determined from the UserAgent. ASP.NET uses various .browser files (in the framework directories and in the web app's App_Browsers directory) to work out what browsers can do. As the UserAgent provided by these spiders doesn't follow a standard regex pattern, ASP.NET cannot get the major and minor versions.
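The failure mode can be illustrated with a toy version parser. This is a sketch only, in Python: the pattern below is a made-up example and is not one of the real ASP.NET .browser definitions, and the function name is hypothetical.

```python
import re

# Hypothetical pattern: recognises "MSIE 7.0" or "Firefox/3.5" style tokens.
VERSION_PATTERN = re.compile(
    r"(?P<browser>MSIE|Firefox)[ /](?P<major>\d+)\.(?P<minor>\d+)"
)

def browser_version(user_agent):
    """Return (major, minor), or None if the UserAgent fits no known pattern."""
    match = VERSION_PATTERN.search(user_agent)
    if match is None:
        return None  # the case that the spiders trigger
    return int(match.group("major")), int(match.group("minor"))

print(browser_version("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"))
print(browser_version("Baiduspider+(+http://www.baidu.com/search/spider.htm)"))
```

Code that goes on to parse the missing version as a number, without checking for the empty result, fails in the way described above. An extra .browser file effectively adds more patterns so that the spiders get a version too.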

Thanks to: Barry Sweeney for his prompt help on Twitter - and Oliver Hine of course.

Below is the DNN exception that I get before applying this fix:





Message: DotNetNuke.Services.Exceptions.PageLoadException: Value cannot be null. Parameter name: String ---> System.ArgumentNullException: Value cannot be null. Parameter name: String
   at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
   at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
   at System.Web.Configuration.HttpCapabilitiesBase.get_MajorVersion()
   at DotNetNuke.UI.Utilities.ClientAPI.BrowserSupportsFunctionality(ClientFunctionality eFunctionality)
   at DotNetNuke.UI.Utilities.ClientAPI.get_ROW_DELIMITER()
   at DotNetNuke.UI.Utilities.ClientAPI.RegisterClientVariable(Page objPage, String strVar, String strValue, Boolean blnOverwrite)
   at DotNetNuke.UI.Skins.Controls.Search.Page_PreRender(Object sender, EventArgs e)
   at System.Web.UI.Control.OnPreRender(EventArgs e)
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Control.PreRenderRecursiveInternal()
   at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
   --- End of inner exception stack trace ---

27 May 2009

DNN Skin token support in a module

This article shows how to get a DotNetNuke (DNN) module to support tokens in skins. It will use, as an example, how I added support for the [CODEMODULE] skin token in our phdcc.CodeModule retail module.

Introduction: skins are used by DNN as a template for the whole site output, and a container template is used to wrap an individual module. Skins and containers can contain standard tokens such as [LOGO] - this is replaced at runtime by the logo image chosen by the site administrator.

Doing it manually

The crucial trick is to add a suitable entry to the ModuleControls table.

This table primarily contains a mapping between the blank (View), Edit and Settings control keys for a module and the actual controls that implement them, eg the "Settings" ControlKey is set to "DesktopModules/phdcc.CodeModule/Settings.ascx" for my module.

To support your new skin token, add a new row with your skin token name in the ControlKey column (eg "CodeModule" with no quotes or brackets), a NULL for the ControlTitle, your view control in ControlSrc and -2 in ControlType. For my module, the ControlSrc is "DesktopModules/phdcc.CodeModule/View.ascx"

Adding a token to a skin

People work with skins in various ways. The standard method is to have an HTML template file that is parsed into an ASCX when the skin is uploaded. If you edit the HTML file on a live site then you can click on [Parse Skin Package] to re-make the ASCX.

Anyway, add your skin token to your HTML skin somewhere, eg add "[CODEMODULE]", without quotes but with brackets - and in upper case. Either package it up and upload, or edit the live version and click on [Parse Skin Package]. The parse log should show your token being replaced. The ASCX should have your token replaced, as per the example below.

If you work with an ASCX template file, then you need to add a suitable Register directive at the top of the ASCX, eg:
<%@ Register TagPrefix="dnn" TagName="CODEMODULE" Src="~/DesktopModules/phdcc.CodeModule/View.ascx" %>
then add instances of this control later on, eg:
<dnn:codemodule runat="server" id="dnnCODEMODULE" />

Hopefully your view control should now be called whenever this skin is used. If so, hurrah! Note that the PortalModuleBase TabModuleId is set to -1 so you could use this to tell when you are being called from a skin.

Token parameters

It is useful to be able to pass parameters to your module, eg so it knows what to display. The parameters are set in the skin.xml file alongside your HTML template. For my module, I wanted to add a ControlFile parameter to tell my module what to do. I added this to skin.xml, after <Objects>

<object>
  <token>[CODEMODULE]</token>
  <settings>
    <setting>
      <name>ControlFile</name>
      <value>viewSkin.ascx</value>
    </setting>
  </settings>
</object>

You can specify several settings for your token. In this case, I set the "ControlFile" setting to value "viewSkin.ascx". Make sure [CODEMODULE] is in upper case.

To support each parameter, you must add a corresponding property to your control. So, in this case, I must add a property called "ControlFile". In VB, this could be:

Private _ControlFile As String

Public Property ControlFile() As String
  Get
    Return _ControlFile
  End Get
  Set(ByVal value As String)
    _ControlFile = value
  End Set
End Property


If you upload or re-parse your skin, you should find that the parameter has now been set in the ASCX, eg:
<dnn:codemodule runat="server" id="dnnCODEMODULE" ControlFile="viewSkin.ascx" />

When your view control is now run, the ControlFile property should have been set before your Page_Load is called. You can use this to tell when you have been called from a skin - and act accordingly.

Automatic installation

Earlier on, we added the crucial row to the ModuleControls table by hand. To set this automatically during installation, you need to add an appropriate data provider. For example, this might be called 01.00.00.SqlDataProvider. Add in SQL code to delete any existing ControlKey and add the correct one, eg:
DELETE FROM {databaseOwner}{objectQualifier}ModuleControls
  WHERE ControlKey='CodeModule'
INSERT INTO {databaseOwner}{objectQualifier}ModuleControls
  (ControlKey,ControlTitle,ControlSrc,ControlType)
  VALUES ('CodeModule',NULL,'DesktopModules/phdcc.CodeModule/View.ascx',-2)


Finally, add in an entry to your Uninstall.SqlDataProvider file to remove this entry if your module is removed:
DELETE FROM {databaseOwner}{objectQualifier}ModuleControls
  WHERE ControlKey='CodeModule'

14 April 2009

DNN development and production sites

Bro John wants me to write up how he does DotNetNuke (DNN) upgrades and site copies. As a preamble, he wrote this:
=============
I would think anyone with a production site would want three copies
A - a live site
B - a transition site
C - a development site

also
D - a trash site

As most of my stuff is now in phdcc.CodeModules, I develop these until they work on C, I copy these to D to see if they still work in a different environ. I then copy them to A and test.

I would prefer to have a B which means I can shut down A and have users running on B and then swap back to A later if for example I do a DNN version upgrade.

Doing a DNN version upgrade on a live site is just asking for trouble.

If this sort of stuff is not sorted in DNN properly then I wouldn't consider it a solid environment to do anything serious in.

I am trying to be careful about keeping all database/email server specific stuff in a few files. Hard links to other pages on the site are OK if you copy the entire site correctly.

Maybe bigger players have other tricks they pull. Maybe they swap DNS pointers. However that takes time to permeate and produces horrid caching problems.

03 April 2009

Content-Location HTTP header for current URL

All our web sites on shared host provider Crystaltech/Newtek have just started serving an extra HTTP header Content-Location for all static web page requests - the header contains the current URL.

To see this in action, use Rex Swain's HTTP Viewer to view this page at our web site: http://www.phdcc.com/phd.html.

If you look in the received output, you will see that the Content-Location header is set to the requested URL:
Content-Location:·http://www.phdcc.com/phd.html(CR)(LF)

Normally, the Content-Location header is used to indicate when the content actually corresponds to another URL. So if you look at http://www.phdcc.com/ you will see this output:
Content-Location:·http://www.phdcc.com/default.htm(CR)(LF)
ie the web site home page is actually called default.htm.

While this is a harmless change for most users, it did in fact fool our FindinSite-MS search engine - I am in the process of releasing a new version to cope with this. Our ISP Crystaltech claims that nothing has changed. Has anyone any further information?

02 April 2009

Lowering ASP.NET memory usage

This blog post is a work in progress on how to keep ASP.NET web application memory usage low. The motivation for this is to avoid the web app being stopped for using too much memory.

Background

The Microsoft Internet Information Services (IIS) web server has administrator options to recycle worker processes, ie to automatically stop the process that runs an ASP.NET web application if a threshold is passed. Some of these thresholds are 'arbitrary', eg the default Elapsed Time of 29 hours. However the Virtual Memory and Used Memory thresholds tend to kick in as you use more memory. If a web app is 'recycled' then you get no warning - the process is simply terminated; the web app is only restarted in response to another web request.

In addition, a web app will tend to slow down as its memory use increases.

I am working on this topic for my FindinSite-MS site search engine. This one web app both (a) does searches and (b) crawls a web site to build a 'search database' that is used by the search. The crawl/index task is done in a separate background thread - an indexing task is either started from the user interface, or run on a predefined schedule. (The app has to be alive for a scheduled index to run - so an outside process can be set up to wake up a web app if need be.)

This software currently suffers from two problems:
- Too much of the database-being-searched is kept in memory
- More significantly: The index process uses a large amount of memory.

Note that my 'search database' db (as I refer to it) is just a set of files in memory - no real database is used. This makes the search engine easier to deploy as a database is not required.

Memory heuristics

I use the System.GC.GetTotalMemory(false) call to get the current total memory usage, without forcing a garbage collection.

I don't have a precise figure for what amount of memory is too big. On our Crystaltech shared host, anything less than 10MB is good, while anything over 100MB is bad - though a recent indexing run worked with a maximum 400MB memory usage.

Redesign methodology

My initial focus was on designing a new 'search database'. This db design needs to be searchable and buildable at the same time, ie sufficiently fast and low-memory while searching - and the same while building.

The main obvious programming technique is not to keep anything in memory. This is actually quite a hard mindset to achieve. For example, it would be quite nice to keep a bit of information in memory about every file indexed, eg URL, title, size, etc. While this might work for 1,000 files, or even 25,000, it is not going to work for half a million.

Using disk instead of memory

If I cannot store information in memory, then I'll have to save it to disk. In fact, this may not be as bad an option as it sounds, as the operating system (and hardware etc) will cache disk data in memory, so performance time may not go down too significantly. Storing data on disk should not impact on ASP.NET memory usage, so should help ensure that my app isn't killed.

Example 1: word file list

I want to store a list of file numbers for each word found during the crawl. My first re-design had 32 of these numbers in memory, along with 4 other integers - with the rest on disk. A second redesign reduced this to 8+4. My latest design simply has 2 integers per word in memory. Everything else is on disk.

The first integer is the block number in the temporary data file. The second number is the last inserted file number - this makes sure that I don't update the block more than once per file for each word.
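The two-integers-per-word idea can be sketched as follows. This is not the FindinSite-MS code: it is a small Python illustration, the class and method names are made up, and a plain list stands in for the temporary data file on disk.

```python
class WordIndex:
    """Keeps two integers per word in memory; file numbers live in 'blocks'."""

    def __init__(self):
        self.words = {}   # word -> [block_number, last_file_number]
        self.blocks = []  # stand-in for the temporary data file on disk

    def add(self, word, file_number):
        """Record that word occurs in file_number; return True if a block was written."""
        entry = self.words.get(word)
        if entry is None:
            # First sighting of this word: start a new block for it.
            self.blocks.append([file_number])
            self.words[word] = [len(self.blocks) - 1, file_number]
            return True
        if entry[1] == file_number:
            # Same file as last time: the block already lists it, so skip the write.
            return False
        self.blocks[entry[0]].append(file_number)
        entry[1] = file_number
        return True

index = WordIndex()
index.add("chris", 1)
index.add("chris", 1)   # repeated word in the same file: no second write
index.add("chris", 2)
print(index.words["chris"], index.blocks[0])
```

The check against the last inserted file number only works because files are processed one at a time, so a word's occurrences in one file arrive together.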

Example 2: file list

I want to know whether I've indexed a file before. I used to keep 2 (yes two) lists of files in memory. Now this is all written out to disk.

So how do I check quickly if I've already indexed a file? OK, I do have a List<> of all files indexed. Each List element is a structure that contains two integers, a FileHash and a FilePointer. The FileHash is the HashCode of the file path, and the FilePointer is the location of the full information on disk.

To check whether I've indexed a file, I find the HashCode of the file path. I then iterate through the List<>. If the hash matches then I use the FilePointer to retrieve the path from disk. If this matches, then the file has been indexed before. I keep looking if it doesn't match, in case two or more files have the same hash.
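Here is a minimal sketch of that lookup, in Python rather than .NET. The names are made up, the hash is zlib.crc32 (chosen only because it gives the same value on every run), and the on-disk record is just the path on a line of its own, whereas the real store holds fuller information.

```python
import io
import tempfile
import zlib

class IndexedFiles:
    """In memory: (hash, pointer) pairs only. On disk: the full paths."""

    def __init__(self, store, hash_func=None):
        self.store = store  # binary file object opened for reading and writing
        self.hash_func = hash_func or (lambda p: zlib.crc32(p.encode("utf-8")))
        self.entries = []   # list of (file_hash, file_pointer)

    def add(self, path):
        self.store.seek(0, io.SEEK_END)
        pointer = self.store.tell()
        self.store.write(path.encode("utf-8") + b"\n")
        self.entries.append((self.hash_func(path), pointer))

    def _path_at(self, pointer):
        self.store.seek(pointer)
        return self.store.readline().rstrip(b"\n").decode("utf-8")

    def contains(self, path):
        wanted = self.hash_func(path)
        for file_hash, pointer in self.entries:
            # Only go to disk when the hash matches; keep looking on a
            # mismatch in case several paths share the same hash.
            if file_hash == wanted and self._path_at(pointer) == path:
                return True
        return False

files = IndexedFiles(tempfile.TemporaryFile())
files.add("/phd.html")
print(files.contains("/phd.html"), files.contains("/other.html"))
```

The memory cost is two integers per file, at the price of a linear scan through the list for each check.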

Example 3: reversed word list

To support wild cards at the start of a search, eg a search for [*hris], I need to reverse each word, so that "chris" becomes "sirhc". [I won't reveal my algorithm just now.]

At one stage, I created a list of reversed words as I went through the normal word list. However, I now write the reversed words out to disk first. I then clear the word list (see below) to reduce my memory footprint. Finally I read my reversed words in for processing.
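For readers unfamiliar with the general technique, here is a textbook sketch of why reversed words help. It is not the algorithm used in FindinSite-MS, which is not described here. A search with a wild card at the start becomes a prefix search on a sorted list of reversed words. The function names are made up.

```python
import bisect

def build_reversed_list(words):
    """Return the words reversed and sorted, ready for prefix searches."""
    return sorted(word[::-1] for word in words)

def match_leading_wildcard(reversed_words, pattern):
    """Find words matching a pattern such as '*hris' (wild card at the start only)."""
    prefix = pattern.lstrip("*")[::-1]          # '*hris' -> 'sirh'
    start = bisect.bisect_left(reversed_words, prefix)
    matches = []
    for reversed_word in reversed_words[start:]:
        if not reversed_word.startswith(prefix):
            break                               # sorted, so no later entry can match
        matches.append(reversed_word[::-1])     # turn it back the right way round
    return matches

reversed_words = build_reversed_list(["chris", "christmas", "iris", "boris", "paris"])
print(match_leading_wildcard(reversed_words, "*hris"))
```

Because the list is sorted, all of the matches sit next to each other, so the search does not have to look at every word.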

Example 4: SortedDictionary.Clear()

I use a SortedDictionary during the indexing run. If I set this to null and do a garbage collect, then no memory is freed. If I call Clear() and then set this to null, the memory is cleared. [And I'm pretty certain that there are no other references to items in the dictionary.]

Should I use Cache?

Is it safe to use the ASP.NET Page.Cache? I presume that using this will still add to the memory used by the application, so the web app could still be shut down unilaterally, without trying to clear the Cache.

I do use the Cache as a part of the search process. However I set an expiry time of 5 minutes - this provides a useful cache while a user is searching, but clears the memory when they have probably gone anyway.
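The effect of a fixed expiry time can be sketched with a tiny cache class. This is a Python illustration of the idea only, not the ASP.NET Cache API; the class name is made up, and in this sketch an expired entry is only removed when someone next asks for it.

```python
import time

class ExpiringCache:
    """Holds values for a fixed time (default 5 minutes) after they are stored."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so the expiry can be tested
        self.items = {}      # key -> (expires_at, value)

    def set(self, key, value):
        self.items[key] = (self.clock() + self.ttl_seconds, value)

    def get(self, key, default=None):
        entry = self.items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.items[key]   # free the memory once the entry is stale
            return default
        return value

cache = ExpiringCache()
cache.set("query:chris", ["hit 1", "hit 2"])
print(cache.get("query:chris"))
```

A short expiry keeps the benefit while a user is paging through results, and bounds how long a large result set can sit in memory.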

Storing data in the Session

I have just remembered that I store the search results for each user in a Session variable. This is useful, eg when they ask for the second page of hits for a search. This results data could be reasonably big, so I now think that this is not wise. The Session variable will presumably be cleared after say 20 minutes, but this is too big a risk. I'll have to store the results to disk, retrieve and clear as necessary.

Large Object Heap objects

This article on The Dangers of the Large Object Heap says that any object larger than 85kB (or 8kB for doubles) might result in increased memory usage. Try not to use large objects.
Does each collection or generic collection count as one object, or is each one a multitude of its individual components?

Current progress

A crawl of 100,000 simple HTML files now takes 11 minutes and uses a maximum of 23MB of memory. When this db is loaded for searching, the rest state of the web app is 6MB. After a couple of searches this went up to 22MB.

The previous version took 83 minutes and used 126MB max memory. When loaded for searching, 44MB of memory is used, going up to 50MB after a couple of searches.

This is not exactly a completely fair test as the new code is not complete in several important ways. However it does show dramatic memory usage reductions. I am not exactly sure why there's such a dramatic speed improvement - it might be because of the reduced memory usage - or it could be the simpler not-really-complete algorithm.

29 March 2009

DNN portal Import/Export Link Url bug

In DNN DotNetNuke 4.92, you can use [Host][Portals] to export a complete Portal as a template to the Portals/_default/ directory. You can then import the template elsewhere using [Admin][Site Wizard].

Summary: if you have pages that have a "Link Url" that refers to another page on the site, then these links do not survive the import/export process properly. Currently, you will need to fix these up by hand after the import, going through [Admin][Pages] to edit the selected page's settings.

Preamble: in DNN, each page is identified by an integer TabId. This nomenclature stems from DNN's initial versions that referred to each page as a "tab".

Details: When a "Link Url" is set up to another page on the site, the page's TabId is stored as an integer in the Tabs database table Url column. When the portal is exported, this Url TabId number is stored in the template (in a <url> XML element). On import, the Url TabId is simply stored directly in the new page row Url column. However, the TabId of the desired "Link Url" page will almost certainly have changed, so the page will redirect the user to the wrong page.
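To make the problem concrete, here is a sketch of the fix-up that an import would need to do. It is written in Python, is not DNN code, and assumes hypothetically that each page in the template carried its original TabId (as noted further down, the export does not currently include it). All of the names are made up.

```python
def import_pages(template_pages, next_tab_id):
    """Assign new TabIds, then rewrite 'Link Url' values that held old TabIds.

    template_pages: list of dicts with 'old_id', 'name' and 'url', where 'url'
    is either empty or an old TabId stored as a string.
    """
    id_map = {}      # old TabId -> new TabId
    imported = []
    for page in template_pages:
        id_map[page["old_id"]] = next_tab_id
        imported.append({"tab_id": next_tab_id, "name": page["name"], "url": page["url"]})
        next_tab_id += 1
    # Second pass: every page now has its new id, so links can be translated.
    for page in imported:
        if page["url"].isdigit() and int(page["url"]) in id_map:
            page["url"] = str(id_map[int(page["url"])])
    return imported

pages = [
    {"old_id": 40, "name": "Home", "url": ""},
    {"old_id": 41, "name": "Shortcut", "url": "40"},   # links to Home
]
for page in import_pages(pages, next_tab_id=95):
    print(page)
```

Without the second pass, the Shortcut page would still point at TabId 40, which on the new site is either a different page or no page at all.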

Background: The Site Wizard does not really describe what it does very well. An old book says that the import process is *additive*, ie the pages in the template are *added* to the existing pages on your site. However this isn't entirely correct: if you select "Replace" in the wizard, then the existing pages are soft-deleted, ie their TabName has "_old" appended and the page is marked as Deleted.

The import Site Wizard assigns a new TabId to each imported page (if a new page is created). This means that it is virtually certain that the TabId-s have changed. If your template etc has hard-coded links to particular TabId-s, then these are almost guaranteed to be wrong.

It would be useful if it were possible to import a template and keep the original TabId-s. Currently, the portal export does not contain the current TabId for each page. Superficially it is also not possible to keep existing TabId-s, as the Tabs table TabID column is an identity column. However it is possible to use the SET IDENTITY_INSERT command to insert a row with a particular TabId.

Reported on DNN Gemini bug tracker

To do:
- See if there is a better way of putting links (eg in Text/HTML and in code) so that they survive the import/export process. Me: in code you can do this using the TabController GetTabByName function.
- Double-check what the Ignore and Merge site wizard options do

Bro John adds:
- These problems typically cause an infinite redirect loop; IE does not notice this but FF and Chrome do. Me: The problem in one case is that the Login page was not visible to anonymous users, so it redirects to the Login page ad mausoleum. The import process does not reset the LoginTabId; it only sets it if a tab with tabtype logintab is found. Unless the old LoginTabId is Replaced, this should be safe.
- The logon menu item typically fails so you need to have added a hand made login page to be able to log in to the site once you have created it ...

28 January 2009

Environment.GetCommandLineArgs backslash and double-quote issue

Just noticed that System.Environment.GetCommandLineArgs() handles command line arguments that contain a backslash (\) followed by a double-quote in a strange way.

Microsoft documentation says:
"If a double quotation mark follows two or an even number of backslashes, each proceeding backslash pair is replaced with one backslash and the double quotation mark is removed. If a double quotation mark follows an odd number of backslashes, including just one, each preceding pair is replaced with one backslash and the remaining backslash is removed; however, in this case the double quotation mark is not removed."

This becomes an issue when you use double-quotes to delimit parameters that may contain spaces. If a parameter is a directory such as C:\test\ then it will be quoted as "C:\test\" but it will be returned by GetCommandLineArgs() as C:\test" ie with a double-quote instead of a backslash.

In my case, I was able to add a space between the \ and the double-quote, and then Trim() the argument in the receiving app.

The full command line is available in the Environment.CommandLine property.

Also: a command line argument such as a"b"c is returned as abc by GetCommandLineArgs().
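
If you build the command line yourself, another way round this is to double any trailing backslashes before adding the closing double-quote, so that the rule quoted above turns each pair back into a single backslash. This is only a minimal sketch: QuotePath is a hypothetical helper name, not a framework method, and it assumes the path has no embedded double-quotes (a Windows path cannot contain them).

```csharp
using System;

static class CommandLineQuoting
{
    // Hypothetical helper: wraps a path in double-quotes for use on a command line.
    // Any trailing backslashes are doubled so the closing quote is not escaped.
    public static string QuotePath(string path)
    {
        if (path.IndexOf('"') >= 0)
            throw new ArgumentException("Embedded double-quotes are not handled by this sketch.", "path");

        // Count the backslashes at the end of the path
        int trailing = 0;
        while (trailing < path.Length && path[path.Length - 1 - trailing] == '\\')
            trailing++;

        // Append the same number again, then the closing quote
        return "\"" + path + new string('\\', trailing) + "\"";
    }
}
```

With this, C:\test\ is passed as "C:\test\\" and should arrive in the receiving app with its single trailing backslash intact.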

20 January 2009

Custom VS Web Setup Projects

A Web Setup project installs an ASP.NET web application on a Microsoft Internet Information Services (IIS) web server. The simplest web application might consist of Default.aspx and Default.aspx.cs files; a simple web setup project would install these files into a directory of the installer's choice and mark that directory as an IIS application, eg so that the web app runs at http://localhost/installdir/

The Visual Studio 2008 Web Setup project looked like a good starting point for distributing a new version of my web application (with web service and a Forms app that tests the web service). I just needed to customise it to:
  • Add links to the web app in the Start Menu (eg to http://localhost/installdir/default.aspx)
  • Add links to the web app on the install wizard finished page
  • Set NTFS write permissions for the web app directory
  • [Later] Make the MSI install work in Vista

Little did I know what a task I had set myself. Perhaps buying an installation tool might have been worthwhile... Follow me, as I battle to achieve these modest aims.

My software is aimed at individual users and to run on servers. Running as an ASP.NET web app, it could fulfil both target systems. I want to have Start Menu links so that individuals have a means to access the web site easily. The Start Menu would also have a shortcut to a Windows Forms exe that would test the web service.

Basic set up: Files, Program Files and Start Menu

First, make sure that you are targeting the right version of ASP.NET - you can choose this at the top right of the VS "Add New Project" dialog. Once your project is created, look at the project properties. Click on [Prerequisites] and make sure that your chosen base version is selected. I selected ".NET Framework 2.0 (x86)" and "Windows Installer 3.1".

Next, add the project output from your web app project. View the File System and add in your aspx pages (but not .aspx.cs), Global.asax and release Web.Config files, together with any other files you want in the web app directory. Make sure that the bin directory has all the needed assembly DLLs.

My Windows Forms application had best go in the Program Files directory, so I added a "Program Files Folder" special folder to the setup File System. I created a "Product" sub-folder for my software, and added the Forms app output to this folder. [Later on, I also put my custom installer into this folder.]

I added a "User's Start Menu" special folder to the setup File System. I added "Programs" then another "Product" folder inside. I want to add my URL links to the web app in here, but I had to do that in my custom installer. However I did create a shortcut to the Forms app. I had wanted to pass the actual web app URL as a parameter to this app, but this proved not to be possible, so I amended the app to get the URL from a file.

MSI Database overview

The web setup project creates a Windows Installer MSI file, and an accompanying Setup.exe. An MSI file has a real relational database inside that governs how the install proceeds. The database has a series of tables each with rows with various columns.

For example, the Dialog table has one row to define each dialog that can appear during the install. The Control table has rows that define each user interface element in the dialogs.

You can see what's in the MSI database using the orca tool, available in the Windows SDK. Later on, I'll use orca to build a Transform that changes the main text of the finished dialog box. The msitran tool can then be used as the setup PostBuildEvent to run this transform after every project rebuild.

The install process maintains various properties, some of which are defined in the Property table. Other properties are set at install time - these are the ones I am interested in:

  1. [TARGETVDIR]: The chosen virtual directory, eg Product
  2. [TARGETDIR]: The consequent web app path, eg C:\inetpub\wwwroot\Product
  3. [StartMenuFolder]: The Start Menu, eg C:\Documents and Settings\All Users\Start Menu

NB: There are various Win32 functions that let you work with the MSI database, such as MsiOpenDatabase().

Custom Action Installer NET assembly

You can achieve some things in a .NET custom action assembly, but your code is quite isolated from the main install. I don't think you can use/edit the current MSI database, nor can you interact with any of the setup dialogs.

To create a custom installer, create a C# class library project - give it a useful name. Then delete the initial class1.cs. Add a new item "Installer Class" and give the file and class a meaningful name. Switch to code view. You will see that your class inherits from System.Configuration.Install.Installer.

There are various overrides that you can add. You will definitely want to override Install() and possibly Commit(). You will almost certainly also want to override Rollback() and Uninstall() - to undo any changes that you have done. In fact, you must override and call Install() even if you only want to work in the other methods.

For each override that you add, add a call to the base class at the start of your implementation, eg:

public override void Install(IDictionary stateSaver)
{
  base.Install(stateSaver);
}

Compile this project and add the output to the web setup project File System Program Files "Product" sub-folder.

Note: if an exception is thrown in your Install() method, the install will rollback.

Web setup Custom Actions

In the web setup project, view the Custom Actions screen. Add a Custom Action for each of the types that you wish to support. Make sure that the InstallerClass property is True. For the Install action, set the CustomActionData as follows:


/TargetDir="[TARGETDIR]\" /TargetVDir="[TARGETVDIR]" /StartMenuFolder="[StartMenuFolder]\"

This sets three parameters that are available to your custom installer assembly. Note the double quotes, and the trailing backslashes that seem to be mandatory in two cases.

Installer custom action

In your Install() code, use the base class Context.Parameters StringDictionary to retrieve the values passed in the CustomActionData:

string StartMenuFolder = Context.Parameters["StartMenuFolder"];
string TargetDir = Context.Parameters["TargetDir"];
string TargetVDir = Context.Parameters["TargetVDir"];

Getting the web app URL

The URL of the installed web app will probably be http://localhost/product/ where product is the user's chosen virtual directory, as supplied in TargetVDir. However I want to try to check this. This code gets the path that should correspond to http://localhost/:

DirectoryEntry localhost1 = new DirectoryEntry(@"IIS://localhost/W3SVC/1/Root");
string localhostRoot = localhost1.Properties["Path"][0].ToString();

If this directory is C:\inetpub\wwwroot and the TargetDir is C:\inetpub\wwwroot\Product\ then I think we can be fairly certain that appending TargetVDir to http://localhost/ will give the correct URL of your installed web app. This code sets MakeLocalhostLinks to true if we can safely create URL shortcuts:

bool MakeLocalhostLinks = true;
// Check the length first so that Substring() cannot throw if TargetDir is shorter than the root
if ((TargetDir.Length < localhostRoot.Length) ||
  (String.Compare(localhostRoot, TargetDir.Substring(0, localhostRoot.Length), true) != 0) ||
  (String.Compare(@"\" + TargetVDir + @"\", TargetDir.Substring(localhostRoot.Length), true) != 0))
{
  MessageBox.Show("Could not create Start Menu links", "Installer");
  MakeLocalhostLinks = false;
}
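
The same comparison can be pulled out into a small function that is easier to test away from the installer. This is a sketch only: IsDirectChildOfRoot is a hypothetical name, and unlike the code above it tolerates a missing trailing backslash on either path, which is an assumption about what you would want.

```csharp
using System;

static class LocalhostCheck
{
    // Hypothetical helper: true when targetDir is exactly <localhostRoot>\<targetVDir>\
    // The comparison ignores case, as Windows paths are not case sensitive.
    public static bool IsDirectChildOfRoot(string localhostRoot, string targetDir, string targetVDir)
    {
        // Normalise both paths so that each ends with exactly one backslash
        string expected = localhostRoot.TrimEnd('\\') + @"\" + targetVDir + @"\";
        string actual = targetDir.TrimEnd('\\') + @"\";
        return String.Equals(expected, actual, StringComparison.OrdinalIgnoreCase);
    }
}
```

MakeLocalhostLinks could then be set from IsDirectChildOfRoot(localhostRoot, TargetDir, TargetVDir).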

Add URL links to the Start Menu

I can now find our Start Menu folder that has just been created and add links there. A link is a text file with extension .url that has one line containing [InternetShortcut] and the next specifying the link after URL=

The following code does this job, creating a link to begin.aspx on the web app site, as well as creating a fixed link to the product documentation. In both cases, the path to the newly created link is saved in the stateSaver IDictionary, for use later on rollback or uninstall.

string UserStartMenuFolder = StartMenuFolder + @"Programs\product\";
if (Directory.Exists(UserStartMenuFolder))
{
  if (MakeLocalhostLinks)
  {
    string BeginLinkPath = UserStartMenuFolder + @"Begin.url";
    using (StreamWriter twBeginLink = new StreamWriter(BeginLinkPath))
    {
      twBeginLink.WriteLine("[InternetShortcut]");
      twBeginLink.WriteLine("URL=http://localhost/" + TargetVDir + "/begin.aspx");
      stateSaver.Add("BeginLinkPath", BeginLinkPath);
    }
  }
  string InfoLinkPath = UserStartMenuFolder + @"Product documentation.url";
  using (StreamWriter twInfoLink = new StreamWriter(InfoLinkPath))
  {
    twInfoLink.WriteLine("[InternetShortcut]");
    twInfoLink.WriteLine("URL=http://www.example.com/product/");
    stateSaver.Add("InfoLinkPath", InfoLinkPath);
  }
}
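
Since both links are written in the same way, the file writing could be moved into one small helper. A minimal sketch, where UrlShortcut.Write is a hypothetical name of my own; it writes only the two lines described above and nothing else:

```csharp
using System;
using System.IO;

static class UrlShortcut
{
    // Hypothetical helper: writes a two-line .url file pointing at the given URL.
    // An existing file at linkPath is overwritten.
    public static void Write(string linkPath, string url)
    {
        using (StreamWriter writer = new StreamWriter(linkPath))
        {
            writer.WriteLine("[InternetShortcut]");
            writer.WriteLine("URL=" + url);
        }
    }
}
```

The Install() code would then call UrlShortcut.Write(BeginLinkPath, ...) and still add each path to stateSaver itself.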

Commit

Commit is called when the install is complete. You cannot make changes to the stateSaver during Commit().

Commit() could be a good time to present some extra choices to the user, start an application or show a web page. Note that the installer waits for your method to return before continuing. You could spawn another process if you want to leave a dialog box open.
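
As a rough illustration of showing a web page without holding up the installer, the sketch below hands a URL to the Windows shell, which should open it in the default browser and return straight away. PostInstallLaunch and its method names are hypothetical, and I am assuming that the installer runs under an account that is allowed to start a browser, which may not be true for every install.

```csharp
using System;
using System.Diagnostics;

static class PostInstallLaunch
{
    // Hypothetical helper: describes how to open a URL via the shell
    public static ProcessStartInfo BuildStartInfo(string url)
    {
        ProcessStartInfo info = new ProcessStartInfo(url);
        info.UseShellExecute = true; // let the shell pick the default browser
        return info;
    }

    // Starts the browser and returns without waiting for it to exit
    public static void OpenWithoutWaiting(string url)
    {
        Process.Start(BuildStartInfo(url));
    }
}
```

An overridden Commit() could call PostInstallLaunch.OpenWithoutWaiting() with the web app URL after calling base.Commit().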

Rollback and Uninstall

If anything goes wrong with the install, or an uninstall occurs, you need to undo any changes you have made. In Rollback() and Uninstall() you have access to the values that were saved in stateSaver. Both Rollback() and Uninstall() call my own new method RemoveCustomAdditions(). This gets the saved path strings; if they are present then the file is deleted.

public override void Rollback(IDictionary stateSaver)
{
  base.Rollback(stateSaver);
  RemoveCustomAdditions(stateSaver);
}
public override void Uninstall(IDictionary stateSaver)
{
  base.Uninstall(stateSaver);
  RemoveCustomAdditions(stateSaver);
}
private void RemoveCustomAdditions(IDictionary stateSaver)
{
  try
  {
    string BeginLinkPath = stateSaver["BeginLinkPath"] as string;
    if (!String.IsNullOrEmpty(BeginLinkPath))
      File.Delete(BeginLinkPath);

    string InfoLinkPath = stateSaver["InfoLinkPath"] as string;
    if (!String.IsNullOrEmpty(InfoLinkPath))
      File.Delete(InfoLinkPath);
  }
  catch (Exception ) { }
}

During rollback or uninstall it is best if you don't throw any exceptions, so catch and ignore these.

My web app may create data files during normal operation. The rollback/uninstall methods could remove these files. However, an upgrade installation will usually uninstall the software first (even if RemovePreviousVersions is false). To keep the data files through an upgrade, these must be left in place. My documentation tells the user to remove these data files for a complete uninstall.

There should be no need to undo any file permission changes as the directory should be about to disappear.

NTFS File Permissions

In my Install() method, I want to let my web app write to its directory and any sub-folders. In IIS5, aspnet_wp is the worker process that runs all ASP.NET web apps using username "ASPNET". In IIS6 and IIS7, the w3wp worker process runs each ASP.NET application pool using username "Network Service". This is the code that changes the file permissions for both these user accounts for the TargetDir:

// Get the current access control list for the web app directory
DirectorySecurity security = Directory.GetAccessControl(TargetDir);
try
{
  // IIS5: give ASPNET Modify rights, inherited by sub-folders and files
  FileSystemAccessRule access = new FileSystemAccessRule("ASPNET",
    FileSystemRights.Modify,
    InheritanceFlags.ContainerInherit | InheritanceFlags.ObjectInherit,
    PropagationFlags.None,
    AccessControlType.Allow);
  security.AddAccessRule(access);
  Directory.SetAccessControl(TargetDir, security);
}
catch (Exception ) { } // Ignore, eg if this account does not exist on this system

try
{
  // IIS6 and IIS7: the same rights for Network Service
  FileSystemAccessRule access = new FileSystemAccessRule("Network Service",
    FileSystemRights.Modify,
    InheritanceFlags.ContainerInherit | InheritanceFlags.ObjectInherit,
    PropagationFlags.None,
    AccessControlType.Allow);
  security.AddAccessRule(access);
  Directory.SetAccessControl(TargetDir, security);
}
catch (Exception ) { } // Ignore, eg if this account does not exist on this system

Changing the text in the final finished wizard page

I had originally wanted to provide a means of starting the web app on the final wizard page. So far I have not found a way to do this (either provide links/buttons on the page, or put checkboxes with options that are actions when Finish is pressed).

However I did manage to update the text to include the words "Now click on the Product links in the Start Menu".

Using the Microsoft SDK orca tool, I opened a first cut of my MSI. Create a New Transform. In the Control table, find the row for FinishedForm BodyText. Right-click on the Text column and Copy Cell. Paste it into Notepad and alter the text as you want. Copy the text and Paste Cell back into the Text cell. Select Generate Transform and save as an .MST file. Note that you cannot use "\r\n" in the cell text although line breaks can be pasted.

Use the Microsoft SDK msitran tool to apply the transform to an MSI, ie in the PostBuildEvent for your setup project. I found that calling msitran directly caused the build to fail because it has a return code of 1. I therefore had to use a batch file that called msitran and then did something innocuous to clear the return code, eg:

msitran.exe -a ..\AlterSetupFinishedText.mst ProductSetup.msi
dir

Vista Installation

If you double-click on the MSI itself in Vista, then it will fail with an obscure message "The installer was interrupted before.." This is because of lack of administrator privileges, even if you are logged in as an Administrator.

The simplest solution to this problem is to run the Setup.exe program that the web app project creates - you are prompted by Vista UAC to OK this privilege escalation.

The alternative is to open a command prompt using Run as Administrator, and then enter msiexec /i yourInstaller.msi

And finally, sign your assemblies and MSI

Ensure that your project assemblies are signed before rebuilding the Web Setup project. Sign the MSI and associated Setup.exe.

Suggestion for Microsoft

Can you provide a means of accessing the current MSI database in .NET Installer assemblies, please?