23 Apr 2013

How Google Sites, Docs, Forms, Picasa and Google Groups Reunited Long-Lost Childhood Friends

Sometime in the first week of July 2009, a couple of School-time Friends residing in different cities were interacting with each other, feeling the pain of not having been able to meet in the 25 years since leaving the School. They were also feeling the itch of having lost contact with most of our Friends. I had passed out of the School in 1984, when there were no mobile phones and even Landline phones were scarce. So, we lost contact.

They all kept saying, let's get together, but the desire was not materializing for one reason or another. I was in Ghaziabad (Delhi-NCR region), Rajesh Chandra Prasad was in Rajkot (Gujarat), Rajiv Ranjan Dwivedi was in Varanasi, Rajeev Kumar Singh was in Vasundhara (Ghaziabad) and Anil Kumar Jha in Noida.

All of a sudden, one fine morning, Rajesh called me and said, "I am coming to Delhi on 25th July; call the other friends and let's meet." I called up Rajiv Ranjan at Varanasi, and he immediately agreed. Anil Kumar Jha and Rajeev Kumar Singh too agreed immediately. As we anticipated a small gathering, I offered my residence in Rajendra Nagar, Ghaziabad as the venue of the get-together. We informed a couple more friends about it. Now a surprising chain reaction started, with Nidhi Chaturvedi (Sharma) and Bimal Kumar most instrumental in giving oxygen to the reaction: everyone started informing every other person he/she was in touch with, passing on contact details and spreading the word about the get-together of 25th July 2009. Finally, the get-together happened on 25th July 2009, with extraordinary emotions and warmth. The weather was not very supportive (it was hot and humid), aggravated by a power cut. People from different batches gathered at my residence at around 08:00 pm. Many people later regretted that they could not join. It was a truly extraordinary, cheerful gathering, as most of the people were seeing each other after almost 25 years.

Get-together of Baraunians on 25th July 2009
About our common legacy and the School: I and all my friends referred to above had studied in a School called "Kendriya Vidyalaya", situated in the township of Urvarak Nagar, HFC (Hindustan Fertilizer Corporation Ltd), Barauni, Bihar, India. The School is popularly known as KV HFC Barauni. Barauni is an Industrial belt; children of Employees of HFC, IOC (Indian Oil Corporation Ltd), the Thermal Power Plant and nearby localities used to study in the School. I and most of my Batchmates had joined the School in 1972, in the 1st standard, when the School was established. I passed out of the School after the 12th standard in 1984 and joined Allahabad University, after which my visits to Barauni got limited to once a year. My Father took voluntary retirement in 1988 and went back to Ghazipur to look after our ancestral Farmland, and I never visited Barauni again.

The Fertilizer Factory of HFC Barauni was closed by the Government of India owing to huge losses, and the Urvarak Nagar Colony became deserted. All the Employees and their families, who had close community bonding, got scattered and disconnected. The KVS (Kendriya Vidyalaya Sangathan) too decided to close the School in April 2003. Although the Students and Parents from nearby areas protested, nothing happened. Later, the Urvarak Nagar Colony was handed over to the SSB (Sashastra Seema Bal) of the Armed Forces to operate its Training Centre. This made the Colony inhabited once again, and the School reopened. Then came an over-enthusiastic Politician, Ram Vilas Paswan (then Chemicals and Fertilizers Minister), who mooted the non-feasible idea of reopening the Fertilizer Factory based on Natural Gas, with no Natural Gas available in the near vicinity. He also gave marching orders to the SSB to vacate the premises. The Colony again became deserted, and KVS now tries to close the School at the beginning of each session year. We Ex-Students of KV HFC Barauni are emotionally attached to the School and would never like to see our Alma Mater closed like this.

The KV HFC Barauni, School Building and Playground in 1988

The news of the 25th July get-together started spreading like wildfire. The Mini-Get-Together sparked a nuclear chain reaction, and information about newly found old Friends started pouring in every day. I started maintaining a Spreadsheet file with contact details. Whenever anyone found a new old friend, they informed me; I in turn called the newly discovered School-time friend (of whichever Batch), noted the complete contact details and made an entry in the Spreadsheet file. This made me the central point of the endeavor, though everyone's effort was equal. By this time we had started calling our Group "Baraunians", and we coined a slogan for the Group: "Be Baraunian forever".

I started circulating the Spreadsheet every morning and evening, with new entries, through email to everyone in the list. But as the list became longer, the emailing software started refusing to send mail to so many people in one go. So, on 30th July 2009, I created a group ID on Google Groups, Baraunians@googlegroups.com. Further, the circulated Spreadsheet was difficult for people to refer to, as they could not keep track of the latest version; most of the time people called me to ask for the contact details of someone or the other. To overcome this problem, I uploaded the Spreadsheet to Google Docs (now known as Google Drive) and embedded it on Google Sites. People started referring to the Google Sites website, configured on a subdomain of my then blog domain as baraunians.techds.in. In September 2009, I registered Baraunians.com and relaunched the Group's website. The relaunched website www.Baraunians.com was welcomed by the existing members and became a super hit. We received calls from many people who were desperately searching for their Friends and found our Website through Google Search.

The Google features that helped in building the Website:
  1. Google Sites
  2. Google Docs
  3. Google Forms
  4. Google Picasa Albums
  5. Google Groups
  6. Google Search
  7. Google Mail

www.Baraunians.com Website

In the meantime, people started sending photographs from their personal collections, and these were posted in an organized manner, embedded via Picasa on the Website. Whoever visited the Site was bound to become nostalgic. People also started sharing stories from their childhood and School days, and those were duly posted in the appropriate sections of the Website. Parents too started reconnecting with their long-lost colleagues. You may like to visit the website at http://www.Baraunians.com.

Immediately after the 25th July Get-Together, we all started discussing the need for a Mega-Get-Together, and people started aligning towards the proposal of the 3rd week of December 2009. We discussed a mutually convenient date and zeroed in on 26th December 2009. For the venue, we considered several suggestions and even visited many of them to understand the costs.

On 16th October 2009, we finalized the Centaur Hotel, Delhi (situated near IGI Airport) for the Mega-Get-Together (MGT). We discussed at length the contribution amount and the mode of collection. We were not able to organize a Bank Account for the Group in such a short span of time (due to the formalities involved). Rajiv Ranjan Dwivedi came forward and offered his Bank account, which he was not operating, to be used for depositing Contributions. Contributions started pouring into the account via Cheque and Cash, and the details of contributions received were transparently posted on our Website.

Thereafter, we had several meetings to finalize the finer details of the Mega-Get-Together.

Planning Meeting at Fortune Hotel, Noida
Planning Meeting at Centaur Hotel, Delhi

It was also decided that we would invite our erstwhile School Teachers and reimburse all their travel expenses. The stay arrangements for outstation Baraunians and Teachers were made at the Ginger Hotel (Taj Group) near New Delhi Railway Station.

Needless to say, the MGT-2009 reunion was a mega success, with people from all parts of the World participating. Even the Photographers and Videographers were spellbound to see such an emotional gathering; they later told us it was an unbelievable event. You may read the complete narration of the MGT-2009 event and the nostalgic reactions of some of the attendees.

Thereafter, we had MGTs on 25th Dec 2010 in Barauni (at our School Campus), on 24th Dec 2011 in Varanasi and on 24th Dec 2012 in Ranchi. The details may be found on the Baraunians Website. Baraunians also started organizing Mini-Get-Togethers in different Cities.

There were four Catalyst elements behind the burgeoning success of the Baraunians Group.

  1. Latent desire of every Baraunian to get re-united.
  2. Forceful visit of Rajesh Chandra Prasad from Rajkot to Delhi on 25th July 2009.
  3. Alok Mall's statement to me that, because we had organized a Mini-Get-Together in July, the fizz of the grand-meet desire would die down. I took it as a challenge.
  4. The joining of Umesh Sharma of Batch-1980 in the Group on 13th Sep 2009. Prior to his joining, we were rudderless; he gave great guidance to the Group with his able Leadership and Organizing Power.
The primary Goal behind the formation of this Group was to get the Baraunians reunited and to give them a platform to interact with each other. The motive was purely emotional and nostalgic; there was no commercial, political or formal agenda.

The day it deviates from this primary Goal, I will have to rethink the existence of the Group-Mail and the Website www.Baraunians.com.

18 Nov 2012

Google Analytics Limits and the "(Other)" Bucket

Each Standard Report in Google Analytics (GA) is pre-calculated on a daily basis from tables called "Dimension value aggregates". Each pre-calculated report stores only 50,000 rows per day. The top 49,999 rows get actual values, and the last, 50,000th row gets the value "(other)", holding the sum of all the remaining row values.

It's a “good problem to have” - more than 50,000 Landing Pages per day - wow !

In the above illustration (with a one-day date range), we notice the "(other)" bucket in the Landing Pages Report because we are sending more than 50,000 Landings per day to this standard report. Generally this works fine: the totals are always correct, and most people only view the top 100 results and never jump to row 49,999. But when I try to do a long-tail analysis of Landings v/s Bounces with Estimated True Value, so as to arrive at a list of Pages to improve, I get bottlenecked. The problem gets aggravated when we select a date range of weeks.

For multi-day reports, a page that is grouped into the "(other)" category one day may not be grouped into it another day. So when running a report over a multi-day date range, you may run into inconsistencies, as a page (or other dimension value) in the long tail may fall into the "(other)" bucket on some days and get its own row on others.

Further, for multi-day standard reports, the maximum number of aggregated rows per day is 1M/D, where D is the number of days in the query. For example:
A report for the past 30 days would process a maximum of 33,333 rows per day (i.e. 1,000,000/30).
A report for the past 60 days would process a maximum of 16,666 rows per day (i.e. 1,000,000/60).
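These limits can be expressed as a tiny helper for a quick sanity check (a sketch based only on the two limits stated above: the 50,000-row daily cap and the 1M/D multi-day rule; the function name is my own):

```python
def max_rows_per_day(days: int) -> int:
    """Maximum pre-aggregated rows per day for a GA standard report.

    A single-day report stores at most 50,000 rows; multi-day reports
    are further limited to 1,000,000 / D rows per day.
    """
    return min(50_000, 1_000_000 // days)

print(max_rows_per_day(1))   # 50000 (the single-day cap applies)
print(max_rows_per_day(30))  # 33333
print(max_rows_per_day(60))  # 16666
```

Notice that the 1M/D rule only starts to bite beyond 20 days; below that, the 50,000-row daily cap is the tighter limit.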

Is there a way to get around the "(Other)" bucket issue?

Yes, we can partially circumvent the "(Other)" bucket issue. I say partially because we will only be able to see unsampled data up to 250K Visits, after which GA's Sampling algorithm kicks in.

We can create an advanced segment that matches all sessions and apply it to a standard report. For example, we can create an advanced segment on the dimension Visitor Type that matches the regular expression .* (this is NOT the same as applying the "All Visits" segment).

Let us see the original report with this Advanced Segment applied.

Wow, it works !

In cases where the report query cannot be satisfied by existing aggregates (i.e. the pre-aggregated tables), GA goes back to the raw session data to compute the requested information. This applies to reports with Advanced Segments too: reports with advanced segments use the raw session and hit data to re-calculate the report on the fly.

Typically, advanced segments are used to include or exclude sessions from processing. But when we create a segment that matches all sessions, we end up merely bypassing the pre-calculated reports and forcing the entire report to be re-calculated.

A few points to note: the numbers between pre-calculated and on-the-fly calculated reports may differ, as each type of report has different limits. Pre-calculated reports only store 50k rows of data per day, but they process all sessions (visits).

Reports calculated on-the-fly can return up to 1 million rows of data, but they only process 250k sessions (visits). After 250k visits, sampling kicks in. The 250k sample size is the default and can be slid up to a maximum of 500k.

So this solution works best when we have less than 500k visits in our date range. (We can find the number of sessions in the date range by looking at the Visits metric in the Traffic Overview report.)

How Sampling Works in Google Analytics

20 May 2012

How Unique Visitors Are Calculated in Google Analytics

The Visitor Metrics data in Google Analytics don't match up in various parts of the UI and API. So here's how they are calculated (as explained by Nick Mihailovski from the GA Team).

This metric is extremely powerful because it represents "reach" of a site, and gives you a true view of total visitors for most combinations of dimensions, across the date range.

Currently there are 2 calculations of Unique Visitors in Google Analytics, and which one is used depends on the other dimensions present in the query:

If you query for Visitors with only time/date dimensions:

Each session has a timestamp of the first hit of the previous session (__utma cookie format = Domain-Hash.Visitor-Token.First-Visit-Start.Previous-Visit-Start.Current-Visit-Start.Visit-Count). As Google Analytics goes through all the sessions in the date range, it increments Visitors if the previous timestamp is before the start of the date range. This works well because it requires no memory, so it's fast; this is how the overview reports are calculated. The only issue is that if the browser time is off, the timestamps will be incorrect, leading to some bad data.
In Custom Reports this metric is called Visitors.
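For illustration, the __utma layout described above can be split into named fields. (A minimal sketch; the field names and the sample cookie value below are my own inventions with the right shape, not values from GA documentation.)

```python
def parse_utma(cookie_value: str) -> dict:
    # __utma format, per the description above:
    # Domain-Hash.Visitor-Token.First-Visit-Start.
    # Previous-Visit-Start.Current-Visit-Start.Visit-Count
    keys = ("domain_hash", "visitor_token", "first_visit_start",
            "previous_visit_start", "current_visit_start", "visit_count")
    return dict(zip(keys, (int(part) for part in cookie_value.split("."))))

# A made-up cookie value, just to show the parsed fields:
utma = parse_utma("173272373.1949273296.1352835025.1352835025.1352840000.2")
print(utma["previous_visit_start"])  # 1352835025
print(utma["visit_count"])           # 2
```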


If you query for Visitors with any other dimension, or include a filter of a non-time dimension:

Each session also has a Visitor ID. This ID is the same for a Visitor across all their sessions. As Google Analytics processes each session, it stores each ID in memory once, then returns the total count. So while this method is more reliable, it requires memory and is a bit slower.

In Custom Reports this metric is called Unique Visitors.

In the GA API both calculations are mapped to ga:visitor and one is picked depending on the dimensions selected.

The reason there are two calculations is that Google Analytics wants to provide a fast user experience. The main overview report gets viewed many times, so to keep the experience fast, the timestamp method is used. In custom reports, GA wants the data to be as accurate as possible, so the Visitor ID approach is used.
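The two calculations can be sketched roughly like this (a toy model of the logic described above, not GA's actual code; the session field names are hypothetical):

```python
def visitors_by_timestamp(sessions, range_start):
    # Fast, memory-free method used for the overview reports: count a
    # session when the previous visit started before the date range,
    # i.e. this is that visitor's first session inside the range.
    return sum(1 for s in sessions if s["previous_visit_start"] < range_start)

def unique_visitors_by_id(sessions):
    # Slower but more reliable: count distinct Visitor IDs seen in range.
    return len({s["visitor_id"] for s in sessions})

sessions = [
    {"visitor_id": "A", "previous_visit_start": 90},   # first session in range
    {"visitor_id": "A", "previous_visit_start": 120},  # repeat session
    {"visitor_id": "B", "previous_visit_start": 50},   # first session in range
]
print(visitors_by_timestamp(sessions, range_start=100))  # 2
print(unique_visitors_by_id(sessions))                   # 2
```

Here the two methods agree, but they can diverge: the timestamp method depends on client-side clocks being correct, while the ID method trades memory and speed for accuracy.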

22 Apr 2012

Should I Check eMail ?

I stumbled upon an interesting post, Managing Distraction: How and Why to Ignore Your Inbox. The whole article is worth reading, but the graphic says it all in short.

15 Apr 2012

Most Effective SEO Tactics - Content is the King

In order of effectiveness, the most important SEO Tactics adopted by Search Engine Marketers are as follows:
  1. Content Creation - "Content is King"
  2. Keyword and Key-Phrase Research
  3. Title Tags
  4. SEO Landing Pages
  5. External Link Building
  6. URL Structure
  7. Blogging
  8. Meta Description Tags
  9. Digital Asset Optimization (images, videos, podcast, webinars, PDFs etc)
  10. Social Media Integration
  11. XML Sitemap
  12. Internal Linking
  13. Competitor Benchmarking

According to the 2012 MarketingSherpa Search Engine Marketing Report, of all SEO tactics available, “content creation works the best, but takes the most work,” says Kaci Bower, Research Analyst, MECLABS.
I always say that if you just consistently focus on the Content, even if you forget everything else, your site is sure to be a winner.

Here is a Graphic from MarketingSherpa which gives an idea of the effort v/s effectiveness of SEO Tactics.

Effectiveness of SEO Tactics

11 Mar 2012

Understanding Google Crawling & Indexing

Pierre Far (Webmaster Trends Analyst at Google) spoke on "Understanding Google Crawling & Indexing" at the Think Visibility SEO conference at the Alea Casino (Leeds) on 3rd March 2012.

I have tried to sum up the points he touched on in his presentation (collected from various Blogs and tweets), plus my own interpretation.

Google gets URLs through crawling, links, Sitemaps and the Add URL feature.

There are always more URLs than Google can fetch, so they try to get as many as possible without overwhelming your website. To do this they use a relaxed crawl rate.

Google increases the URL crawl rate slowly and watches whether response time goes up. If your site can't handle the crawler, they will not crawl much of your site.

Google checks robots.txt only about once per day, to help keep the load off your server. And having a +1 button on your site can override robots.txt. Both these points are interesting to me.

Google sets a conservative crawl rate per server, so too many domains or URLs on one server will reduce the crawl rate per URL. If you use shared hosting, this could easily be problematic for you. If you do not know how many other websites are on the same IP-Address as you, you may be surprised; you can easily check this by putting your domain or IP-Address into Majestic's neighbourhood checker to see how many other websites are hosted on the same IP-Address. If a shared site on the same IP has a large number of URLs and it is not yours, then you could be losing crawl opportunities just because a big site that isn't connected to you in any way sits on the same IP. You can't really go complaining to Google about this: you bought the cheap hosting, and this is one of the sacrifices you made.

Google crawls more pages than those in your Sitemap, but the Sitemap does help them decide which pages are more important.

If a CMS has huge duplication, Google knows, and this is how it notifies you of duplicates in GWMT (Google Webmaster Tools). This is interesting because it is more efficient to realize a site has duplicate URLs at this point than after Google has had to analyze all the data and deduplicate on your behalf. Google then picks URLs in a chosen order. One important factor in choosing one page over another is the change rate of the page content.

Googlebot can be blocked from accessing your server, so you need to make sure your hosts have no issues, or Google will think your site is down. Both the biggest and the smallest ISPs can block Googlebot at the ISP level. Because ISPs need to protect their bandwidth, the fact that you want Google to visit your site does not necessarily mean it will be so. Firewalls at the ISP may block bots before they even see your home page, or (more likely) start throttling bits. So if your pages are taking a long time to get indexed, this may be a factor.

Strong recommendation: set up email notifications and email forwarding in Webmaster Tools as a priority. This is very important so you don't miss any error messages.

Make sure your 404 page delivers a 404 status, or it will get indexed, which happens a lot. Soft error pages create an issue, so Google tries hard to detect them. If they can't, they end up crawling the soft error page, using up a crawl slot (at the expense of another URL crawl, maybe). If you don't know what a soft error is: it is when an error page returns a 200 response instead of a 404 response. You can use the Firefox add-on Live HTTP Headers to check this.
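To see what a soft 404 looks like to a crawler, here is a small self-contained sketch: it spins up a throwaway local server whose "Page not found" page wrongly returns a 200 status (a deliberately misconfigured soft error), then checks the status code the way any HTTP client would. (The server, URL and function names are purely illustrative.)

```python
import http.server
import threading
import urllib.error
import urllib.request

class SoftErrorHandler(http.server.BaseHTTPRequestHandler):
    # Simulates a misconfigured site: the "error page" returns 200.
    def do_GET(self):
        self.send_response(200)  # should be 404 for a missing page!
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<h1>Page not found</h1>")

    def log_message(self, *args):
        pass  # keep the demo output quiet

def status_of(url):
    # Return the HTTP status code for a URL, even for 4xx/5xx responses.
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code

server = http.server.HTTPServer(("127.0.0.1", 0), SoftErrorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/no-such-page"
code = status_of(url)
print("soft 404!" if code == 200 else f"real error status: {code}")
server.shutdown()
```

A correctly configured server would return 404 here, and the check would report the real error status instead.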

Google has to pick the best URL and title for your content. They can change the title to better match the query. They then generate a snippet and Sitelinks. Changing them improves the CTR; it's as if you are writing a different title for each query.

If a server spikes with 500 errors, Googlebot backs off. Also, firewalls etc. can block the bot. After a few days, this can create a state in Google that says the site is dead. If Googlebot gets a 503 error on robots.txt, they stop crawling the site entirely. So be careful: if only some part of your site is offline, do not serve a 503 on robots.txt.

Googlebot is getting better and better at handling JavaScript/AJAX-driven sites and pages.

For displaying a result, Google needs to:
  1. Pick a URL.
  2. Pick a Title: usually the Title Tag, though sometimes changed based on the user query. This is a win-win for everyone.
  3. Generate a Snippet: created from the stuff on the page, but Google strongly recommends using Rich Snippets.
  4. Generate Sitelinks: whether these appear depends on the query and the result. If you see a bad Sitelink issue (a wrong link), check for canonicalisation issues.

Pierre pointed out that all this is in the Google Webmaster Documentation - http://support.google.com/webmasters/?hl=en

4 Mar 2012

The Great Engineering Elephant

This gigantic animal, 12 metres high by 8 metres wide, lives in the largest warehouse.

Massive, fully-functioning robots made from reclaimed materials are a green-tech lover's dream come true.

Inspired by the Sultan's Elephant, an interactive show featuring a mechanized elephant, the massive robot looks surprisingly lifelike, aside from a few visible nuts, bolts and joints at the trunk and legs.

The 12-metre-high by 8-metre-wide elephant was pieced together using 45 tons of reclaimed wood and steel.

When the majestic animal goes out for its walk, it is like architecture in motion departing a steel cathedral. The 49 passengers on board embark on an amazing journey on the Ile de Nantes. Each time the pachyderm goes out, it is a unique spectacle for everyone to enjoy.

From the inside, the passengers can see the moving gears that power the legs. They can make the elephant trumpet and control some of its movements, thus truly becoming a part of the Machine. On the back of the Elephant, it's like being on the 4th floor of a moving house, with a breathtaking view of the banks of the Loire River. In this time-travelling carriage, the passengers can voyage into the imaginary world of Jules Verne in the city where he was born.

12 m high and 8 m wide
50 tonnes
Wood: American Tulip
Metallic carcass irrigated by 3000 liters of hydraulic oil
450 HP engine
An indoor lounge with French doors and balconies
A terrace accessible via stairways
Route: Approximately 45 minutes
Speed – 1/3 km per hour

I don't think you will be able to find a more impressive reclaimed robot... I dare you.