DIY Web Analytics

Julien de Charentenay
7 min read · Mar 16, 2021

The views/opinions expressed in this story are my own. This story relates my personal experience and choices, and is provided for information in the hope that it will be useful but without any warranty.

Image by Benjamin Hartwich from Pixabay

Wikipedia’s web analytics definition is better and more extensive than what I would be able to come up with — so please refer to it if web analytics is what you are after.

I use web analytics for my side projects and my requirements are limited. I have used freely available tools — and am still using them — but I am trying to move away from them as “if you’re not paying for the product, you are the product”.

This page describes a low-code and (hopefully) low-cost approach I implemented to provide basic DIY web analytics using AWS AppSync, DynamoDB and JavaScript to tag the following web traffic characteristics:

  • Page views;
  • Referral sources.

This story is divided into two sections. The first section focuses on setting up the AWS infrastructure and the second section focuses on setting up page tagging on the website.

The following image provides a preview of the architecture employed so you can have a sneak peek at it before deciding to dive into the details.

Software architecture schematic preview, inspired by “How it works” diagram on AWS AppSync overview

AWS Infrastructure

The proposed approach leverages AWS AppSync, a scalable GraphQL API platform, to define endpoints that the website can query without boilerplate code for managing endpoints, translating queries into database operations, etc.

I use one AppSync API per website. This post is based on the monitoring of my personal website.

Create AWS AppSync API for data logging

This step creates the API used for capturing website page views and referral information:

  • Log onto AWS Console and navigate to the AppSync home page;
  • Create a new API using “Build from scratch” — I called my AppSync API Charentenay.me Page Tagging API. At this stage, AWS created a new blank API that needs a schema and the definition of resolvers;
  • The schema is defined by selecting “Edit Schema” in the “Getting Started” page of the API just created. The schema can be manually edited, but a more straightforward approach for a new schema is to use “Create Resources” and “Define new type”. The type I used records the creation timestamp, the URL visited and the referral URL (if available), making use of the AWS built-in types AWSDateTime and AWSURL, and is shown below.
type CharentenayMeVisit {
  id: ID!
  timestamp: AWSDateTime
  visitedUrl: AWSURL
  referralUrl: AWSURL
}
  • During creation, the AWS DynamoDB table is created, and queries, mutations and subscriptions are added to the schema. As the API is solely for the creation of CharentenayMeVisit entries, the queries, mutations and subscriptions should be revisited to prevent CharentenayMeVisit entries from being queried, updated or deleted via this API. During the testing phase, I left these functionalities available.
  • The timestamp field of CharentenayMeVisit is to be set automatically, not given as an API input parameter. This is done by (a) removing timestamp from CreateCharentenayMeVisitInput and (b) editing the resolver createCharentenayMeVisit and amending its request mapping template as follows (credit to hatboyzero’s response on stackoverflow):
{
  "version": "2017-02-28",
  "operation": "PutItem",
  "key": {
    "id": $util.dynamodb.toDynamoDBJson($util.autoId()),
  },
  #set( $myFoo = $util.dynamodb.toMapValues($ctx.args.input) )
  #set( $myFoo.timestamp = $util.dynamodb.toDynamoDB($util.time.nowISO8601()) )
  "attributeValues": $util.toJson($myFoo),
  "condition": {
    "expression": "attribute_not_exists(#id)",
    "expressionNames": {
      "#id": "id",
    },
  },
}
  • This should be it for the creation of the API. The AWS DynamoDB table associated with the API is created automatically when the schema is created and should appear in the AWS DynamoDB page.
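To make the effect of the amended mapping template more concrete, the snippet below mimics it in plain JavaScript. It is only an illustration under my own assumptions, not AppSync code: buildPutItem and toStringAttributes are hypothetical helpers, and only string inputs are handled. The point it shows is that the id and timestamp are generated on the server side and that any timestamp sent by the client is overwritten.

```javascript
// Illustration only: plain JavaScript mimicking the mapping template above.
const { randomUUID } = require('crypto');

// Hypothetical helper: wrap string values in DynamoDB's typed form,
// similar in spirit to $util.dynamodb.toMapValues for string inputs.
function toStringAttributes(input) {
  const attributes = {};
  for (const [name, value] of Object.entries(input)) {
    if (value !== undefined && value !== null) {
      attributes[name] = { S: String(value) };
    }
  }
  return attributes;
}

// Hypothetical helper: build a PutItem-like request object.
function buildPutItem(input, now = new Date()) {
  const attributeValues = toStringAttributes(input);
  // The timestamp is always set here, replacing any client-supplied value.
  attributeValues.timestamp = { S: now.toISOString() };
  return {
    operation: 'PutItem',
    key: { id: { S: randomUUID() } }, // plays the role of $util.autoId()
    attributeValues,
  };
}

const example = buildPutItem({ visitedUrl: 'https://www.charentenay.me/' });
console.log(JSON.stringify(example, null, 2));
```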

Testing the AWS AppSync in-browser

AWS provides a query editor to test the schema and its associated mutations and queries. The query editor is available under the API home page. The screenshot below shows an example creating an entry by running the mutation createCharentenayMeVisit with a referral and a visited URL. After running the mutation, an associated entry is available in the corresponding DynamoDB table.

Running createCharentenayMeVisit mutation through AWS Query Editor

Testing the AWS AppSync using curl

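The AWS AppSync API can also be tested using curl, a command line tool to run HTTP requests. This test requires the API key and URL, both available under the AWS AppSync API Settings page and noted in the command as {API_KEY} and {API_URL}. The following command triggers a mutation similar to the one tested via the in-browser query editor above. It uses the generic mutation name createVisit introduced in the schema cleaning section below; before that renaming, the mutation is called createCharentenayMeVisit: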

curl -X POST \
-H "x-api-key: {API_KEY}" \
-H "Content-Type: application/json" \
-d '{ "query": "mutation Visit { createVisit(input: {visitedUrl: \"http://www.test.com/page\" }) { id timestamp visitedUrl } }" }' \
{API_URL}
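The same request can be sketched in JavaScript, which is closer to how the website calls the API later on. This is a minimal sketch assuming Node 18+ or a browser, where fetch is available globally. The names buildVisitRequest and sendVisit are hypothetical, and apiUrl and apiKey stand for {API_URL} and {API_KEY}.

```javascript
// Build the options for a POST request equivalent to the curl command above.
function buildVisitRequest(apiKey, visitedUrl) {
  // JSON.stringify yields a quoted, escaped string literal for the URL.
  const query =
    'mutation Visit { createVisit(input: {visitedUrl: ' +
    JSON.stringify(visitedUrl) +
    ' }) { id timestamp visitedUrl } }';
  return {
    method: 'POST',
    headers: {
      'x-api-key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ query }),
  };
}

// Send the mutation and return the parsed JSON response.
async function sendVisit(apiUrl, apiKey, visitedUrl) {
  const response = await fetch(apiUrl, buildVisitRequest(apiKey, visitedUrl));
  return response.json();
}
```

Calling sendVisit(apiUrl, apiKey, 'http://www.test.com/page') should then create an entry in the same way as the curl command.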

Testing the query listCharentenayMeVisits allows for checking that the API is working by returning stored entries — if this query has not yet been removed from the schema:

curl -X POST \
-H "x-api-key: {API_KEY}" \
-H "Content-Type: application/json" \
-d '{ "query": "query List { listCharentenayMeVisits { items { id timestamp } } }" }' \
{API_URL}
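When reading the response of such a query from a script, keep in mind that GraphQL reports failures in an errors array in the response body, so the HTTP status alone is not a reliable check. The sketch below shows one way to handle this. The helpers unwrapGraphQLResponse and extractItems are hypothetical names, and the field name is passed in so the code does not depend on how the list query is named.

```javascript
// Return the data part of a GraphQL response, or throw if errors were reported.
function unwrapGraphQLResponse(body) {
  if (Array.isArray(body.errors) && body.errors.length > 0) {
    const messages = body.errors.map((e) => e.message).join('; ');
    throw new Error('GraphQL request failed: ' + messages);
  }
  return body.data;
}

// Extract the items array of a list query, or an empty array if absent.
function extractItems(body, fieldName) {
  const data = unwrapGraphQLResponse(body);
  const connection = data && data[fieldName];
  return connection && Array.isArray(connection.items) ? connection.items : [];
}
```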

Schema cleaning

Queries, mutations and subscriptions are automatically generated when creating the schema from the type. One of the generated queries can be used to retrieve the list of visits, and some of the mutations can be used to amend or delete visits. These operations should be removed, as they would otherwise allow anyone to read or change recorded visits.

The type CharentenayMeVisit is renamed to Query, as a GraphQL schema would otherwise require a separate Query type. In addition, the mutation name is made generic so that identical client code can be used in different applications.

input CreateVisitInput {
  timestamp: AWSDateTime
  visitedUrl: AWSURL
  referralUrl: AWSURL
}
type Mutation {
  createVisit(input: CreateVisitInput!): Query
}
type Query {
  id: ID!
  timestamp: AWSDateTime
  visitedUrl: AWSURL
  referralUrl: AWSURL
}

Website integration

The website is developed using Vue CLI, the standard tooling for Vue.js. AWS AppSync provides an integration based on installing the AWS Amplify toolchain and CLI. I preferred a more manual approach in this context.

  • Install the axios library to provide easy HTTP access: npm install axios. I have since looked into GraphQL JavaScript libraries and have settled on Apollo, through its Vue.js integration Vue Apollo, but direct HTTP calls are sufficient in this use case.
  • The API KEY and URL are defined in a Vuex module, with the key and URL values replaced in the deployment pipeline. The Vuex module is shown below:
/**
 * Vuex module for handling secrets
 */
const state = () => ({
  web_analytics_api_key: "{WEB_ANALYTICS_API_KEY}",
  web_analytics_api_url: "{WEB_ANALYTICS_API_URL}"
});
export const mutations = {};
const actions = {};
const getters = {};
export default {
  namespaced: true,
  state,
  mutations,
  actions,
  getters
};
  • A Vue.js component called WebAnalytics is created to handle the call to the API. This approach allows the call to the API to be made when the page is rendered, and the call can be added to any page by adding the component to the page declaration alongside the HTML tag <WebAnalytics />. The component is as follows:
<!-- Web Analytics component -->
<template>
  <div></div>
</template>
<script>
import axios from 'axios';
import { mapState } from 'vuex';
export default {
  name: "WebAnalytics",
  props: {},
  mounted: function() {
    var input;
    if (document.referrer) {
      input = '{visitedUrl: "' + window.location.href + '", referralUrl: "' + document.referrer + '"}';
    } else {
      input = '{visitedUrl: "' + window.location.href + '"}';
    }
    axios.post(this.web_analytics_api_url, // Url
      { // data
        query: 'mutation { createVisit(input: ' + input + ' ) { id } }'
      },
      { // Config
        headers: {
          'Content-Type': 'application/json',
          'x-api-key': this.web_analytics_api_key
        }
      }
    )
    .then(function() {})
    .catch(function(error) { console.log("Error during page tagging", error); });
  },
  computed: {
    ...mapState({
      web_analytics_api_key: state => state.secrets.web_analytics_api_key,
      web_analytics_api_url: state => state.secrets.web_analytics_api_url,
    })
  }
};
</script>
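The component above builds the mutation by concatenating URLs into the query string, which works as long as the URLs contain no characters that would break the GraphQL string literal. A possible alternative, sketched below, is to pass the input through GraphQL variables so that the HTTP library handles the escaping. This is not what my component does; buildVisitPayload is a hypothetical helper, and the sketch assumes the CreateVisitInput type and createVisit mutation from the cleaned schema.

```javascript
// Mutation text with a variable placeholder instead of inlined values.
const CREATE_VISIT_MUTATION =
  'mutation CreateVisit($input: CreateVisitInput!) { createVisit(input: $input) { id } }';

// Build the request body; referralUrl is only included when a referrer exists.
function buildVisitPayload(visitedUrl, referrer) {
  const input = { visitedUrl };
  if (referrer) {
    input.referralUrl = referrer;
  }
  return { query: CREATE_VISIT_MUTATION, variables: { input } };
}
```

In the mounted hook, the data argument of axios.post would then become buildVisitPayload(window.location.href, document.referrer), with the URL and headers unchanged.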

The approach above allows the secrets module and the WebAnalytics component to be re-used as-is for other websites, with the API key and API URL replaced in the deployment pipeline.
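The replacement step itself depends on the pipeline in use. As a rough sketch of the idea, the function below swaps the {WEB_ANALYTICS_...} placeholders in a source string for values taken from a lookup object such as process.env. The name substituteSecrets is hypothetical, and reading and writing the actual file is left out. Note that the key still ends up in the JavaScript shipped to the browser, so this is about configuration rather than secrecy.

```javascript
// Replace {WEB_ANALYTICS_*} placeholders with values from a lookup object.
function substituteSecrets(source, values) {
  return source.replace(/\{(WEB_ANALYTICS_[A-Z_]+)\}/g, (match, name) => {
    const value = values[name];
    if (value === undefined) {
      // Fail loudly rather than deploy a build with an unresolved placeholder.
      throw new Error('Missing value for ' + name);
    }
    return value;
  });
}

// Example with made-up values.
const template = 'web_analytics_api_key: "{WEB_ANALYTICS_API_KEY}"';
console.log(substituteSecrets(template, { WEB_ANALYTICS_API_KEY: 'example-key' }));
```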

The following items have not been discussed above, but should be considered in the deployment:

  • Security and unauthorised use;
  • Using a CloudFront distribution and Route 53 as discussed here to implement a user-friendly URL and additional protection;
  • Collation of additional information to evaluate further metrics.

Conclusion

It is reasonably straightforward to use AWS AppSync and some relatively simple JavaScript to develop simple DIY page tagging that records visitor page views and referrals. I have introduced this approach on my personal website and will see how well it performs in practice, although the site only gets a small number of visitors.

AWS DynamoDB provides a rudimentary query interface as well as the ability to download the information as a CSV file for local processing. I will investigate Business Intelligence tooling to process the information… and generate pretty dashboards.
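Until then, a few lines of JavaScript are enough for basic local processing. The sketch below assumes the exported records have already been parsed into an array of objects with the visitedUrl and referralUrl fields from the schema; the sample data and the countBy helper are made up for illustration.

```javascript
// Count records per distinct value of a field, most frequent first.
function countBy(records, field) {
  const counts = new Map();
  for (const record of records) {
    const key = record[field];
    if (!key) continue; // skip records without this field, e.g. no referrer
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up sample records.
const visits = [
  { visitedUrl: 'https://www.charentenay.me/', referralUrl: 'https://example.org/' },
  { visitedUrl: 'https://www.charentenay.me/about' },
  { visitedUrl: 'https://www.charentenay.me/' },
];

console.log('Page views:', countBy(visits, 'visitedUrl'));
console.log('Referrals:', countBy(visits, 'referralUrl'));
```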

So whilst it is possible to do, one should consider the risks, benefits and whether a commercial alternative may provide better value for money…

For me, it was an interesting way to spend a weekend and learn two new tricks: GraphQL and AppSync!

Julien de Charentenay

I write about a story a month on rust, JS or CFD. Email masking side project @ https://1-ml.com & personal website @ https://www.charentenay.me/