Athena-express with AWS and federated query
For all the cool kids playing along, you’d know that I posted a primer on AWS Athena a few days ago. If you missed that post, you can check it out here on my site. If you don’t have time and you don’t know anything about Athena, know this:
Athena is a serverless SQL service that lets you query structured, semi-structured and unstructured data using familiar SQL syntax.
Sound good? Moving right along… I am introducing the next part in my Athena series which is, as you may have guessed, athena-express. This is a Node.js library that can be used to query Athena. There is a pretty decent write-up about athena-express on the AWS blog here.
If there is already a good write-up, why am I here?
- You’re bored
- We’ll be talking about federated queries
- I’ll provide a lambda example
- I’ll provide a link to the updates needed to athena-express to work with federated queries
Federated queries
To me this is kind of a big deal and it went under the radar, at least in my space. Federated queries allow us to use Athena to query other data sources, like RDBMS, NoSQL etc. They only went GA in Nov 2020 in 2 regions, and came to Sydney (the best region) in early 2021, around February.
This means that you can use Athena to query DynamoDB via a Lambda connector and use something like:
select count(*) from awesometable;
This could be hooked up to something like API Gateway or AppSync, for example, which I think is pretty cool. There are other challenges with that part but still, I think it’s cool.
We need mods!
Firstly, you don’t need to use athena-express. Check out the GitHub link, which details why athena-express was developed. The AWS SDK provides everything you need to use Athena, but it’s nice to have a helper lib to make it easier.
If you do use the AWS SDK you’ll find that it’s not as simple as just running a query. The SDK submits the query to the Athena service and gives you back an ID; you then use that ID to check on the query and fetch the results… if they are ready and it worked.
And now you know why there are helpers :)
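For the curious, here is a rough sketch of that submit, poll, fetch dance. It assumes the aggregated Athena client from the AWS SDK for JavaScript v3 (@aws-sdk/client-athena); the function name runAthenaQuery and its options are mine, made up for illustration, and it only reads the first page of results.

```javascript
// Sketch only. `athena` is assumed to be an instance of the aggregated
// `Athena` client from @aws-sdk/client-athena (AWS SDK for JavaScript v3):
//   const { Athena } = require("@aws-sdk/client-athena");
//   const athena = new Athena({ region: "ap-southeast-2" });
// runAthenaQuery and its option names are hypothetical, not part of any library.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runAthenaQuery(athena, options) {
  const { sql, database, catalog, outputLocation, pollMs = 1000, maxPolls = 60 } = options;

  // 1. Submit the query. All Athena hands back at this point is an ID.
  const { QueryExecutionId } = await athena.startQueryExecution({
    QueryString: sql,
    // Catalog is what points the query at a federated data source.
    QueryExecutionContext: { Database: database, Catalog: catalog },
    ResultConfiguration: { OutputLocation: outputLocation },
  });

  // 2. Poll until the query leaves the QUEUED/RUNNING states.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const { QueryExecution } = await athena.getQueryExecution({ QueryExecutionId });
    const { State, StateChangeReason } = QueryExecution.Status;

    if (State === "SUCCEEDED") {
      // 3. Fetch the first page of results and flatten each row to plain strings.
      const { ResultSet } = await athena.getQueryResults({ QueryExecutionId });
      return ResultSet.Rows.map((row) => row.Data.map((cell) => cell.VarCharValue));
    }
    if (State === "FAILED" || State === "CANCELLED") {
      throw new Error(`Query ${QueryExecutionId} ${State}: ${StateChangeReason || "no reason given"}`);
    }
    await sleep(pollMs);
  }
  throw new Error(`Query ${QueryExecutionId} did not finish after ${maxPolls} polls`);
}
```

You would call it with something like runAthenaQuery(athena, { sql: "select count(*) from awesometable", database: "default", catalog: "my-dynamodb-catalog", outputLocation: "s3://my-athena-results-bucket/" }), where every one of those names is a placeholder.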
What I found while testing federated queries with both this package and the AWS SDK was that athena-express is missing the catalog parameter, which is how a query gets pointed at a federated data source. I’ve submitted a pull request to include this parameter and updated the readme.
You can find my PR here:
https://github.com/ghdna/athena-express/pull/56
and my fork:
https://github.com/talkncloud/athena-express
Smash the thumbs up button on my PR if you want to see it added!
If you’re using federated queries you’ll need this PR (from my testing anyway) for these types of queries to work.
How do I use this thing?
I won’t be providing the full stack for this one; I’m putting together a stack example and will share it soon. To get started, you’ll need the following in place to enable federated queries (assuming DynamoDB):
- Athena workgroup version 2
- Datasource with DynamoDB connector (Lambda SAM)
- DynamoDB with some data
Once you have the above, you can create a new Lambda function and use the athena-express library. I’ve provided an example of how to use the library with the updated catalog parameter, but the source repo has more examples which might be better.
You can find my example here: https://github.com/talkncloud/aws/tree/main/athena-express
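If you just want the general shape without clicking through, here is a minimal sketch (not copied from the repo). It assumes the catalog option from the PR above is available in your copy of athena-express; the bucket, database and catalog names are placeholders, and buildConfig and createHandler are helper names I made up so the handler can be tried without AWS.

```javascript
// Sketch only, under these assumptions:
//  - `catalog` is the option added by the PR above (not in older releases)
//  - the bucket, db and catalog values are placeholders for your own setup
//  - buildConfig/createHandler are hypothetical helpers, not athena-express APIs

function buildConfig(aws) {
  return {
    aws, // the AWS SDK object athena-express expects
    s3: "s3://my-athena-results-bucket/", // where Athena writes query results
    db: "default", // database exposed by the connector
    catalog: "my-dynamodb-catalog", // the Athena data source for the connector
  };
}

function createHandler(athenaExpress) {
  // Returns a Lambda-style async handler that runs one fixed query.
  return async () => {
    const results = await athenaExpress.query("select count(*) from awesometable");
    return results.Items; // athena-express returns the rows under `Items`
  };
}

// Wiring it up inside a real Lambda (needs aws-sdk and athena-express installed):
//   const aws = require("aws-sdk");
//   const AthenaExpress = require("athena-express");
//   exports.handler = createHandler(new AthenaExpress(buildConfig(aws)));
```

Passing the athena-express instance in, rather than creating it inside the handler, is just so you can swap in a stub when testing locally.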
When you’re all done you can test your new function just like any other function and you should see the results of your query.
Tip: Use the Athena console to test your query first
It didn’t work did it?
If you got it to work first go, well done! I struggled with this bit; the permissions needed to enable this were tricky to narrow down. Which is strange, because AWS wrote an article on this particular problem and supplied examples that I totally overlooked…
https://docs.amazonaws.cn/en_us/athena/latest/ug/federated-query-iam-access.html
For whatever reason, I missed the link between our athena-express function and the Athena connector Lambda function. Our athena-express function needs permission to invoke the connector! The query isn’t just handed off to Athena to magically take care of.
That’s a wrap
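To make that concrete, here is a small sketch of the extra IAM statement the querying function’s role needs. The connector ARN is a placeholder and connectorInvokeStatement is a helper name I made up; this is only the invoke piece, and the AWS article linked above covers the rest of the policy.

```javascript
// Sketch only: builds the IAM statement that lets the role running the
// athena-express function invoke the Athena connector Lambda.
// The ARN below is a placeholder; connectorInvokeStatement is a hypothetical helper.

function connectorInvokeStatement(connectorArn) {
  return {
    Effect: "Allow",
    Action: ["lambda:InvokeFunction"],
    Resource: [connectorArn],
  };
}

const policy = {
  Version: "2012-10-17",
  Statement: [
    connectorInvokeStatement(
      "arn:aws:lambda:ap-southeast-2:123456789012:function:my-dynamodb-connector"
    ),
  ],
};

// Print it so it can be pasted into the role as an inline policy.
console.log(JSON.stringify(policy, null, 2));
```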
A nice short one today, hopefully this helps someone with athena-express and federated queries. I’m pretty excited to see what people build with federated queries and integration with other AWS services.
If you’re using this already, reach out, keen to hear how you’ve gone so far!