One useful exercise is to find out which countries your visitors are coming from. There are several reasons for doing this:
1) You might want to tailor your content to that particular demographic.
2) If you have lots of visitors from a particular country, you might want to consider adding a version of your website in that country's language.
3) If your site has IP-based targeting for ads (programs such as AdSense have an IP-targeting component), this will help you understand why your users are, or are not, clicking on your ads.
The way I did it was to download a flat file that includes IP-to-country mapping data from the link found here. This is a free database. There are other versions that claim to be more accurate, but they charge a fee. For my purposes, the free version is sufficient.
I first loaded this file into a database and then used a table join to look up the country code from the IP number. This turned out to be a time-consuming exercise. Then I remembered that in data warehousing, you want to do as much of your data transformation outside of the database as possible. Applying that principle, I moved the country lookup into the Perl processing routine that runs before the data is loaded into the database. This move proved to be an excellent time-saver.
Below I show the Perl code for matching an IP address with a country code. There are 3 basic steps:
1. Read the IP/country mapping file.
2. Convert the IP address to an IP number.
3. Find the country code based on the IP number.
The code for each step is shown below:
1. Read IP/country mapping file
## ip2country.txt is the file that stores the
## IP number/country mapping data. Assuming the
## following format: IP_START,IP_END,COUNTRY_CODE.
open (IN1, '<', 'ip2country.txt') or die "Cannot open ip2country.txt: $!";
$i=0;
while (<IN1>) {
    chomp;
    ## Split the current line into its three comma-separated fields.
    @ips = split (/,/);
    $ip_start[$i] = $ips[0];
    $ip_end[$i] = $ips[1];
    $ip_country2[$i] = $ips[2];
    $i++;
}
close (IN1);
2. Convert IP address to IP number.
Assume the IP address is already stored in the variable $ip_address.
@ipp = split (/\./,$ip_address);
$ip_number = $ipp[0]*256*256*256+$ipp[1]*256*256+$ipp[2]*256+$ipp[3];
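As a minimal sketch, the conversion above can be wrapped in a small subroutine that also rejects malformed input. The name ip_to_number and the validation rules are my own assumptions for illustration, not part of the original routine; the arithmetic is the same base-256 calculation.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a dotted-quad IPv4 string to its 32-bit number.
# Returns undef (in scalar context) if the input is not four octets in 0-255.
sub ip_to_number {
    my ($ip_address) = @_;
    return unless defined $ip_address;
    # The -1 limit keeps trailing empty fields, so "1.2.3." is rejected below.
    my @octets = split /\./, $ip_address, -1;
    return unless @octets == 4;
    for my $octet (@octets) {
        return unless $octet =~ /^[0-9]{1,3}$/ && $octet <= 255;
    }
    # Same arithmetic as the multiplication above, treating the octets
    # as base-256 digits from most to least significant.
    my $ip_number = 0;
    $ip_number = $ip_number * 256 + $_ for @octets;
    return $ip_number;
}

print ip_to_number('192.168.1.1'), "\n";   # prints 3232235777
```

For 192.168.1.1 this works out to 192*16777216 + 168*65536 + 1*256 + 1 = 3232235777.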
3. Find country code based on IP number
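## We want to find the country code where $ip_number is between $ip_start and $ip_end (inclusive).
## The ranges are assumed to be sorted in ascending order.
$j = 0;
## Skip the ranges that end before $ip_number, stopping at the end of the list.
while ($j < @ip_end && $ip_number > $ip_end[$j]) {
    $j++;
}
if ($j < @ip_start && $ip_number >= $ip_start[$j]) {
    $country = $ip_country2[$j];
} else {
    $country = 'NA';
}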
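The scan in step 3 walks the ranges one by one for every IP address. If the ranges are sorted and do not overlap (an assumption about the mapping file, not something I have verified), a binary search finds the same answer in far fewer comparisons. The sketch below is one possible way to write it. The subroutine name find_country and the sample ranges are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: binary search over ranges sorted by start value.
# $starts, $ends and $countries are array references of equal length.
sub find_country {
    my ($ip_number, $starts, $ends, $countries) = @_;
    my ($low, $high) = (0, $#{$starts});
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        if ($ip_number < $starts->[$mid]) {
            $high = $mid - 1;            # target lies in an earlier range
        } elsif ($ip_number > $ends->[$mid]) {
            $low = $mid + 1;             # target lies in a later range
        } else {
            return $countries->[$mid];   # start <= target <= end
        }
    }
    return 'NA';                         # no range contains the number
}

# Made-up sample ranges and placeholder codes, for illustration only.
my @ip_start    = (100, 200, 400);
my @ip_end      = (199, 299, 499);
my @ip_country2 = ('AA', 'BB', 'CC');

print find_country(250, \@ip_start, \@ip_end, \@ip_country2), "\n";   # prints BB
print find_country(350, \@ip_start, \@ip_end, \@ip_country2), "\n";   # prints NA
```

With the arrays filled in step 1, the call would be find_country($ip_number, \@ip_start, \@ip_end, \@ip_country2).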