File operations in Amazon S3 with Python boto3 library and Flask Application

Chandra Shekhar Sahoo
7 min read · Oct 9, 2020


Introduction

In this blog, we will learn how to perform file operations in Amazon S3 with the Python boto3 library, and also build a customized Flask application to upload and download files from an S3 bucket.

But before proceeding further, one may wonder: what is S3? What is boto3? And, most importantly, why do we need these concepts and where will we use them?

Data has become the driving factor behind technology growth. The need to collect, store, secure, and distribute data has led to increased use of cloud architecture to store and manage data while maintaining consistency and accuracy. Cloud architecture gives us the ability to read, write, upload, and download files from multiple platforms, and this is where the AWS Simple Storage Service (S3) comes into the picture.

What is S3?

Amazon Simple Storage Service (S3) is one of the services of Amazon Web Services (AWS) that allows users to store data in the form of objects. We can store data ranging from images, video, and audio all the way up to backups and static website assets (CSS, .js files). Access privileges for S3 buckets can be specified through the AWS Console, the AWS CLI tool, or the provided APIs and libraries. But how can we interact with S3 objects programmatically to create buckets, upload files, download files, and so on? This is where the boto3 Python library helps you perform all those file operations.

What is Boto3?

Boto3 is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services such as EC2 and S3. Boto3 provides an easy-to-use, object-oriented API, as well as low-level access to AWS services.
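To make that distinction concrete, here is a minimal sketch of the two interfaces side by side: the low-level client maps one-to-one onto S3 API calls, while the higher-level resource gives an object-oriented view of the same service.

import boto3

# Low-level client: a thin wrapper over the raw S3 API calls
client = boto3.client('s3')
response = client.list_buckets()
print([b['Name'] for b in response['Buckets']])

# High-level resource: object-oriented access to the same service
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)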

Now we have understood the basics of S3 and boto3. To learn more, please go through the excellent documentation on Amazon S3 (Amazon Documentation) and the Boto3 Docs.

We will cover a few file operations that are very common:

  1. S3 Bucket Creation
  2. Upload file to Bucket
  3. Download file from a bucket
  4. Move files across buckets
  5. File Deletion from bucket
  6. Bucket Deletion
  7. Get list of objects in a bucket
  8. Check whether an object exists in a bucket or not

Prerequisites

  1. Install boto3 from PyPI:
pip install boto3

2. It is best practice to create credentials and config files and keep them under the .aws directory, named “credentials” and “config”, in the user's home directory. Never hard-code credentials in your code.

To run these examples against your AWS account, you'll need to provide valid credentials. If you already have an IAM user with full permissions to S3, you can use that user's credentials (the access key and the secret access key) without creating a new user. Otherwise, the easiest way is to create a new AWS user and then store the new credentials.

File Path : ~/.aws/credentials

[default]
aws_access_key_id = ACCESS_KEY
aws_secret_access_key = SECRET_KEY

File Path : ~/.aws/config

[default]
region = YOUR_REGION
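If the shared credential files are not an option (for example, inside a container or a CI job), boto3 can also take credentials programmatically through a Session. A minimal sketch; the values below are placeholders, and in practice they should come from environment variables or a secrets manager, never from source code:

import boto3

# Placeholder values shown for illustration only
session = boto3.Session(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    region_name='us-east-1',
)
s3 = session.client('s3')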

Now that we are all set up, let's dive into each operation…

  1. Create Bucket
import boto3

s3 = boto3.client('s3')
s3.create_bucket(Bucket='my-bucket')

Note: the Bucket parameter is mandatory, and bucket names must be globally unique and lowercase, with words separated by ‘-’.
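One caveat: the call above succeeds as written only when your configured region is us-east-1. In any other region, S3 requires an explicit location constraint, roughly like this (a sketch assuming eu-west-1):

import boto3

s3 = boto3.client('s3')
# Outside us-east-1, the bucket's region must be stated explicitly
s3.create_bucket(
    Bucket='my-bucket',
    CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'}
)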

2. Upload file to Bucket

import boto3

# Create an S3 client
s3 = boto3.client('s3')

filename = 'file.txt'        # local file to upload
bucket_name = 'my-bucket'
des_filename = 'file.txt'    # destination file name (object key) in the bucket

# Uploads the given file using a managed uploader, which will split up large
# files automatically and upload parts in parallel.
s3.upload_file(filename, bucket_name, des_filename)
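upload_file expects a path on disk. When the data is already in memory or arrives as a file-like object (as it will later in our Flask application), boto3 also provides upload_fileobj. A minimal sketch, assuming the same bucket:

import io

import boto3

s3 = boto3.client('s3')

# Any binary file-like object works here
data = io.BytesIO(b'hello from memory')
s3.upload_fileobj(data, 'my-bucket', 'hello.txt')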

3. Download file from a bucket

import boto3
import botocore

BUCKET_NAME = 'my-bucket'      # replace with your bucket name
KEY = 'my_image_in_s3.jpg'     # replace with your object key

s3 = boto3.resource('s3')

try:
    s3.Bucket(BUCKET_NAME).download_file(KEY, 'my_local_image.jpg')
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")
    else:
        raise    # re-raise the original error for any other failure
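As with uploads, there is a file-object counterpart: download_fileobj streams an object into any writable binary file-like object, which is handy when you do not want to touch the local disk. A minimal sketch, reusing the bucket and key above:

import io

import boto3

s3 = boto3.client('s3')

buffer = io.BytesIO()
# Stream the object's bytes into the in-memory buffer
s3.download_fileobj('my-bucket', 'my_image_in_s3.jpg', buffer)
print(buffer.getbuffer().nbytes, "bytes downloaded")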

4. Move files across buckets

import boto3

s3 = boto3.resource('s3')
copy_source = {
    'Bucket': 'mybucket',    # Source Bucket Name
    'Key': 'mykey'           # Source File Name
}
# 'otherbucket' - Destination Bucket Name
# 'otherkey'   - Destination File Name
s3.meta.client.copy(copy_source, 'otherbucket', 'otherkey')
# S3 has no native "move"; a move is a copy followed by deleting the source
s3.meta.client.delete_object(Bucket='mybucket', Key='mykey')

5. File Deletion from a Bucket

import boto3

client = boto3.client('s3')
# 'mybucketname' - Bucket Name
# 'myfile.whatever' - File Name
client.delete_object(Bucket='mybucketname', Key='myfile.whatever')
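If several objects need to go at once, delete_objects removes up to 1,000 keys in a single request, which is much cheaper than one call per file. A minimal sketch with two hypothetical keys:

import boto3

client = boto3.client('s3')

# Up to 1,000 keys may be passed per request
response = client.delete_objects(
    Bucket='mybucketname',
    Delete={'Objects': [{'Key': 'file1.txt'}, {'Key': 'file2.txt'}]}
)
print(response.get('Deleted', []))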

6. Bucket Deletion

import boto3

# Create an S3 client
s3 = boto3.client('s3')

# Call S3 to delete bucket 'some-bucket' (the bucket must already be empty)
response = s3.delete_bucket(
    Bucket='some-bucket',
)

# Print response
print(response)
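Since delete_bucket fails with a BucketNotEmpty error while any objects remain, in practice you empty the bucket first. A minimal sketch using the resource API's convenience collection:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('some-bucket')

# Delete every object in the bucket, then the bucket itself
bucket.objects.all().delete()
bucket.delete()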

7. List of objects in a bucket

from boto3 import client

conn = client('s3')
# 'Contents' is missing from the response when the bucket is empty,
# hence the .get() with a default
for key in conn.list_objects(Bucket='bucket_name').get('Contents', []):
    print(key['Key'])
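Note that list_objects returns at most 1,000 keys per call. For larger buckets, a paginator transparently follows the continuation markers; a minimal sketch using the newer list_objects_v2 operation:

import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# Each page holds up to 1,000 keys; the paginator fetches them all
for page in paginator.paginate(Bucket='bucket_name'):
    for obj in page.get('Contents', []):
        print(obj['Key'])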

8. Check whether an object exists in a bucket or not

import boto3
import botocore

client = boto3.client('s3')
s3_key = 'Your file without bucket name e.g. abc/bcd.txt'
bucket = 'your bucket name'

# head_object raises a ClientError (404) when the key is missing, so the
# existence check belongs in a try/except rather than an if/else
try:
    client.head_object(Bucket=bucket, Key=s3_key)
    print("File exists - s3://%s/%s" % (bucket, s3_key))
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("File does not exist - s3://%s/%s" % (bucket, s3_key))
    else:
        raise

Above, we saw some basic file operations in Amazon S3.

Now let's put these concepts together and create a web application dashboard to upload, download, and list files in a bucket.

Flask Application for file operations in S3

  1. Project Setup

We need to install the flask and boto3 libraries to build our Flask application:

$ pip install boto3
$ pip install flask

2. Create a bucket to store data and fetch the list of files.

To create a bucket, we can go to the AWS Console, select S3 from the Services menu, and create the bucket there. Here, we have created a bucket named ‘test-s3-operation’.

3. Flask Application code walkthrough

The following is the file structure of our single-file Flask application:

├── requirement.txt               # stores our application requirements
├── app.py                        # Flask application entry point
├── download_files                # directory to store downloaded files
├── s3_helper.py                  # S3 operations helper functions
├── templates
│   └── s3_storage_dashboard.html # HTML page to interact with S3
└── upload_files                  # directory to store uploaded files

A. requirement.txt — it stores the Python packages, with versions, used in the project

boto3==1.15.15
botocore==1.18.15
click==7.1.2
Flask==1.1.2
itsdangerous==1.1.0
Jinja2==2.11.2
jmespath==0.10.0
MarkupSafe==1.1.1
python-dateutil==2.8.1
s3transfer==0.3.3
six==1.15.0
urllib3==1.25.10
Werkzeug==1.0.1

B. s3_helper.py — this file contains helper functions to upload, download, and list files in our S3 bucket using the boto3 SDK.

import boto3


def upload(file_name, bucket, object_name):
    # Upload a local file to the given bucket under object_name
    s3_client = boto3.client('s3')
    response = s3_client.upload_file(file_name, bucket, object_name)
    return response


def download(file_name, bucket):
    # Download file_name from the bucket into the download_files directory
    s3 = boto3.resource('s3')
    output = f"download_files/{file_name}"
    s3.Bucket(bucket).download_file(file_name, output)
    return output


def list_all_files(bucket):
    # Return the metadata of every object in the bucket;
    # 'Contents' is absent when the bucket is empty, hence the default
    s3 = boto3.client('s3')
    contents = []
    for item in s3.list_objects(Bucket=bucket).get('Contents', []):
        contents.append(item)
    return contents

C. app.py — the entry point from which our Flask application runs

import os
from flask import Flask, render_template, request, redirect, send_file
from s3_helper import list_all_files, download, upload

app = Flask(__name__)
UPLOAD_FOLDER = "upload_files"
BUCKET = "test-s3-operation"


@app.route('/')
def start():
    return "All is set. Ready to go!!!"


@app.route("/home")
def home():
    contents = list_all_files(BUCKET)
    return render_template('s3_storage_dashboard.html', contents=contents)


@app.route("/upload", methods=['POST'])
def upload_files():
    if request.method == "POST":
        f = request.files['file']
        f.save(os.path.join(UPLOAD_FOLDER, f.filename))
        upload(f"{UPLOAD_FOLDER}/{f.filename}", BUCKET, f.filename)
    return redirect("/home")


@app.route("/download/<filename>", methods=['GET'])
def download_files(filename):
    if request.method == 'GET':
        output = download(filename, BUCKET)
        return send_file(output, as_attachment=True)


if __name__ == '__main__':
    app.run(debug=True)

The Flask application implements four endpoints; besides the root / route, which simply confirms the app is running, three do the real work:

  • The /home endpoint displays the home page with the list of all files in our S3 bucket “test-s3-operation” together with their download links; from the same page we can also upload files to the bucket.
  • The /upload endpoint takes a local file from the system, saves it under the upload_files folder, and calls the upload() helper from s3_helper.py to push the file to the S3 bucket “test-s3-operation”.
  • The /download endpoint takes the file name from the URL, uses the download() helper from s3_helper.py to fetch the file into the download_files folder, and returns it as an attachment.

D. s3_storage_dashboard.html — HTML content for the Flask application

<!DOCTYPE html>
<html>
<head>
    <title>File Operation in S3</title>
</head>
<body>
    <div class="content">
        <h3 align="center">File Operations in S3</h3>

        <div>
            <h3>1. Please upload a file to S3 here</h3>
            <form method="POST" action="/upload" enctype="multipart/form-data">
                <input type="file" name="file">
                <input type="submit" value="Upload">
            </form>
        </div>
        <div>
            <h3>2. List of all files in the bucket:</h3>
            <p>Download a file by clicking on its link.</p>
            <ul>
                {% for item in contents %}
                <li>
                    <a href="/download/{{ item.Key }}">{{ item.Key }}</a>
                </li>
                {% endfor %}
            </ul>
        </div>
    </div>
</body>
</html>

Now that we are done with our code, we can start the application with:

$ python app.py

Go to http://localhost:5000/home in the browser and you will see the home page.

Let's check that the code works by uploading a file named “test_s3_upload.xlsx”.

Checking in the S3 console confirms whether the upload succeeded, and we can see that our file now exists in S3.

From the home page, we can download the file by simply clicking on its file name link and saving it on our machine.
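If you prefer the command line, the same endpoints can be exercised with curl; a quick sketch, assuming the application is running locally on port 5000 and the file exists in the bucket:

$ curl -F "file=@test_s3_upload.xlsx" http://localhost:5000/upload
$ curl -OJ http://localhost:5000/download/test_s3_upload.xlsx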

Conclusion

In this blog, we saw some operations to manage files in an Amazon S3 bucket using the Python boto3 SDK, and also implemented a Flask application that stores files on AWS S3 and allows us to download those same files from the application. We no longer require our own servers to handle file storage, as we can hand that over to the AWS Simple Storage Service. It is easy to develop and deploy, and it increases the availability of our applications to end users.

For more information, follow the excellent documentation on Amazon S3 (Amazon Documentation) and the Boto3 Docs.

I will try to bring some more exciting topics in future blogs. Till then:

Code Everyday and Learn Everyday ! ! !

If you liked this blog, hit the 👏 and stay tuned for the next one!
