Azure Storage Account and Access Keys

Azure Storage Accounts provide scalable, durable, and highly available storage for various data types. Similar to AWS S3 buckets, they are a fundamental service for storing data in the cloud. This guide will walk you through creating a storage account and obtaining its access keys for programmatic access over the internet.

Step 1: Create an Azure Storage Account

  1. Sign in to the Azure Portal: Open your web browser and navigate to the Azure Portal. Sign in with your Azure account credentials.

  2. Navigate to Storage Accounts:

    • Once logged in, you can search for "Storage accounts" in the search bar at the top of the portal.

    • Alternatively, from the Azure services dashboard, click on "Storage accounts".

  3. Create a New Storage Account:

    • On the "Storage accounts" page, click the + Create button.

  4. Basics Tab Configuration:

    • Subscription: Select the Azure subscription you want to use.

    • Resource group: Choose an existing resource group or create a new one. A resource group is a logical container for Azure resources.

    • Storage account name: Enter a globally unique name for your storage account. This name will be part of the URL used to access your storage (e.g., https://yourstorageaccountname.blob.core.windows.net). The name must be between 3 and 24 characters in length and can contain only lowercase letters and numbers.

    • Region: Select the Azure region where you want your storage account to be hosted. Choose a region close to your users for lower latency.

    • Performance:

      • Standard: Recommended for most scenarios, including blobs, files, queues, and tables. Backed by hard disk drives (HDDs).

      • Premium: Recommended for scenarios requiring consistently low latency and high throughput, such as premium block blobs, file shares, or page blobs. Backed by solid-state drives (SSDs).

    • Redundancy: This setting determines how your data is replicated for durability.

      • Locally-redundant storage (LRS): Data is replicated three times within a single data center.

      • Zone-redundant storage (ZRS): Data is replicated across three availability zones within a single region.

      • Geo-redundant storage (GRS): Data is replicated to a secondary region hundreds of miles away.

      • Read-access geo-redundant storage (RA-GRS): Similar to GRS, but also provides read access to the data in the secondary region.

    • Click Review + create.

  5. Review and Create:

    • Review your settings. If validation passes, click Create.

    • Azure will deploy your storage account. This process usually takes a few minutes.
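
If you prefer to script this step instead of using the portal, the same account can be created programmatically. Below is a minimal Python sketch using the azure-identity and azure-mgmt-storage packages; the subscription ID, resource group, region, and account name are placeholders you would substitute with your own values:

# pip install azure-identity azure-mgmt-storage

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

subscription_id = "YOUR_SUBSCRIPTION_ID"  # placeholder
client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Kind "StorageV2" and SKU "Standard_LRS" mirror the Basics-tab choices above.
poller = client.storage_accounts.begin_create(
    "my-resource-group",       # placeholder resource group
    "yourstorageaccountname",  # globally unique, 3-24 lowercase letters/digits
    {
        "location": "eastus",
        "kind": "StorageV2",
        "sku": {"name": "Standard_LRS"},
    },
)
account = poller.result()  # blocks until the deployment completes
print(account.primary_endpoints.blob)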

Step 2: Obtain Access Keys

Access keys grant full, unrestricted access to all data in your storage account. They are highly sensitive and should be treated with the same care as root passwords.

  1. Navigate to Your Storage Account:

    • Once the deployment is complete, click Go to resource or find your newly created storage account in the "Storage accounts" list.

  2. Access Keys Section:

    • In the left-hand menu of your storage account blade, under Security + networking, click on Access keys.

  3. Display and Copy Keys:

    • You will see two access keys: key1 and key2. Each key has a Key value and a Connection string value.

    • Click the Show keys button to reveal the actual key values.

    • Copy the "Key" value for either key1 or key2. It's a long base64-encoded string. You can use either key; having two allows for key rotation without downtime.

    • You can also copy the Connection string, which includes the account name and key, formatted for direct use in many applications.

    Example of a Key (blurred for security): **********************************************************************************

    Example of a Connection String (blurred for security): DefaultEndpointsProtocol=https;AccountName=yourstorageaccountname;AccountKey=**********************************************************************************;EndpointSuffix=core.windows.net

  4. Securely Store Your Keys:

    • Crucially, store these keys in a secure location. Do not embed them directly in client-side code, public repositories, or easily accessible plain-text files.

    • For applications, use environment variables, Azure Key Vault, or managed identities for Azure resources to handle credentials securely.
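
For example, reading the key from environment variables keeps it out of source code entirely. A minimal Python sketch (the variable names are placeholders you define in your shell or app configuration):

import os

from azure.storage.blob import BlobServiceClient

# Assumed environment variable names; set them outside the code,
# never in source control.
account_name = os.environ["AZURE_STORAGE_ACCOUNT_NAME"]
account_key = os.environ["AZURE_STORAGE_ACCOUNT_KEY"]

connect_str = (
    f"DefaultEndpointsProtocol=https;AccountName={account_name};"
    f"AccountKey={account_key};EndpointSuffix=core.windows.net"
)
blob_service_client = BlobServiceClient.from_connection_string(connect_str)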

Step 3: Accessing the Storage Account Over the Internet Using Keys

Once you have your storage account name and an access key, you can use Azure SDKs or REST APIs to interact with your storage account from anywhere.

General Approach:

  1. Choose an Azure SDK: Azure provides SDKs for various programming languages (e.g., Python, .NET, Java, Node.js, Go).

  2. Install the SDK: Add the relevant Azure Storage SDK package to your project.

  3. Initialize the Client: Use your storage account name and one of the access keys (or the connection string) to initialize a client object for the specific storage service you want to use (e.g., Blob service client, File service client).

  4. Perform Operations: Use the client object to perform operations like uploading blobs, listing containers, downloading files, etc.

Example (Conceptual - Python using azure-storage-blob):

# pip install azure-storage-blob

from azure.storage.blob import BlobServiceClient

# NEVER hardcode keys in production code!
# Use environment variables, Azure Key Vault, or Managed Identities.
AZURE_STORAGE_ACCOUNT_NAME = "yourstorageaccountname"
AZURE_STORAGE_ACCOUNT_KEY = "YOUR_ACCESS_KEY_COPIED_FROM_PORTAL"

# Create the BlobServiceClient object
try:
    connect_str = f"DefaultEndpointsProtocol=https;AccountName={AZURE_STORAGE_ACCOUNT_NAME};AccountKey={AZURE_STORAGE_ACCOUNT_KEY};EndpointSuffix=core.windows.net"
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)

    # Example: List containers
    print("Listing containers in the storage account:")
    for container in blob_service_client.list_containers():
        print(f"- {container.name}")

    # Example: Upload a blob (assuming a container named 'mycontainer' exists)
    # container_client = blob_service_client.get_container_client("mycontainer")
    # with open("local_file.txt", "rb") as data:
    #     container_client.upload_blob(name="remote_file.txt", data=data)
    # print("File uploaded successfully!")

except Exception as e:
    print(f"An error occurred: {e}")

Example (Conceptual - C# using Azure.Storage.Blobs):

// Install the NuGet package: Azure.Storage.Blobs
// dotnet add package Azure.Storage.Blobs

using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using System;
using System.Threading.Tasks;

public class AzureStorageAccess
{
    public static async Task Main(string[] args)
    {
        // NEVER hardcode keys in production code!
        // Use environment variables, Azure Key Vault, or Managed Identities.
        const string AZURE_STORAGE_ACCOUNT_NAME = "yourstorageaccountname";
        const string AZURE_STORAGE_ACCOUNT_KEY = "YOUR_ACCESS_KEY_COPIED_FROM_PORTAL";

        // Create the BlobServiceClient object using the connection string
        // The connection string can also be retrieved from the Azure Portal.
        string connectionString = $"DefaultEndpointsProtocol=https;AccountName={AZURE_STORAGE_ACCOUNT_NAME};AccountKey={AZURE_STORAGE_ACCOUNT_KEY};EndpointSuffix=core.windows.net";

        try
        {
            BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);

            // Example: List containers in the storage account
            Console.WriteLine("Listing containers in the storage account:");
            await foreach (BlobContainerItem containerItem in blobServiceClient.GetBlobContainersAsync())
            {
                Console.WriteLine($"- {containerItem.Name}");
            }

            // Example: Upload a blob to a container named 'mycontainer'
            // Ensure 'mycontainer' exists or create it:
            // BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient("mycontainer");
            // await containerClient.CreateIfNotExistsAsync();

            // string localFilePath = "local_file.txt";
            // string blobName = "remote_file.txt"; // Name of the blob in Azure Storage

            // Console.WriteLine($"Uploading {localFilePath} to {blobName}...");
            // BlobClient blobClient = containerClient.GetBlobClient(blobName);
            // await blobClient.UploadAsync(localFilePath, overwrite: true);
            // Console.WriteLine("File uploaded successfully!");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
            Console.WriteLine(ex.StackTrace); // For detailed error information
        }
    }
}

Security Best Practices for Access Keys:

  • Least Privilege: Grant only the necessary permissions. Instead of giving full storage account access, consider using Shared Access Signatures (SAS) for time-limited, granular access to specific containers or blobs, especially for client-side applications (see the sketch after this list).

  • Key Rotation: Regularly rotate your access keys. Azure provides two keys (key1 and key2) to facilitate this: use one while rotating the other, then switch.

  • Monitor Activity: Enable logging and monitoring for your Azure Storage Account (e.g., using Azure Monitor, Azure Storage Analytics) to track access patterns and detect suspicious activity.

  • Azure AD Integration: Whenever possible, use Azure Active Directory (Azure AD, since renamed Microsoft Entra ID) for authentication and authorization instead of access keys. This provides more granular control, integrates with identity management, and allows for role-based access control (RBAC).

  • Network Security: Configure network rules (firewalls, virtual networks) to restrict access to your storage account from specific IP addresses or networks if needed.
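
As an illustration of the least-privilege point above, here is a Python sketch that issues a read-only SAS URL for a single blob, valid for one hour, using generate_blob_sas from azure-storage-blob. The account, container, and blob names are placeholders:

from datetime import datetime, timedelta, timezone

from azure.storage.blob import BlobSasPermissions, generate_blob_sas

sas_token = generate_blob_sas(
    account_name="yourstorageaccountname",
    container_name="mycontainer",
    blob_name="remote_file.txt",
    account_key="YOUR_ACCESS_KEY",
    permission=BlobSasPermissions(read=True),                 # read-only
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),   # valid 1 hour
)

url = (
    "https://yourstorageaccountname.blob.core.windows.net/"
    f"mycontainer/remote_file.txt?{sas_token}"
)
print(url)  # hand this URL to the client instead of the account key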

Cloud SFTP Storage Comparison: Azure, AWS, and GCP

Secure File Transfer Protocol (SFTP) is a widely used method for securely transferring files. When migrating to the cloud, choosing the right SFTP solution depends on factors like scalability, cost, management overhead, and integration with other cloud services. This document compares how you can achieve a 5TB SFTP storage solution on Azure, AWS, and Google Cloud Platform (GCP).

1. Azure SFTP Storage

Azure offers native SFTP support directly on Azure Blob Storage, specifically on storage accounts with a hierarchical namespace enabled (Azure Data Lake Storage Gen2). This provides a fully managed, scalable, and cost-effective solution without requiring virtual machines.

Setup Overview:

  1. Create an Azure Storage Account: Choose a Standard performance tier and enable hierarchical namespace.

  2. Enable SFTP: Toggle the SFTP feature within the storage account settings.

  3. Create Container: Create a container (e.g., sftpdata) where files will be stored.

  4. Create Local User: Set up local users with passwords or SSH keys, defining their home directory within the container and specific permissions.

  5. Connect: Use any SFTP client with the storage account's endpoint (<storage-account-name>.blob.core.windows.net on port 22).
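
For instance, a connection can be scripted with the third-party paramiko library. The sketch below assumes a local user named myuser with password authentication and a home directory in the sftpdata container; the username format <storage-account>.<container>.<user> (or <storage-account>.<user>) follows Azure's Blob SFTP convention:

import paramiko

host = "yourstorageaccountname.blob.core.windows.net"
username = "yourstorageaccountname.sftpdata.myuser"  # placeholder local user

# For production use, verify the server's host key before transferring data.
transport = paramiko.Transport((host, 22))
transport.connect(username=username, password="YOUR_LOCAL_USER_PASSWORD")
sftp = paramiko.SFTPClient.from_transport(transport)

print(sftp.listdir("."))                        # list the user's home directory
sftp.put("local_file.txt", "remote_file.txt")   # upload a file
sftp.close()
transport.close()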

Cost Considerations (Azure):

  • Storage: Billed per GB for data stored in Azure Blob Storage. Pricing varies by redundancy (LRS, GRS, etc.) and access tier (Hot, Cool, Archive). In the Hot tier with LRS, pricing is approximately $0.023 per GB/month for the first 50TB, so 5TB runs roughly $118 per month.

  • SFTP Endpoint: A fixed hourly charge for the SFTP endpoint (e.g., $0.30 per hour, approximately $216 per month). This is a flat rate per storage account, regardless of the number of users or containers.

  • Operations: Charges for read/write operations (transactions) on the storage account.

  • Data Transfer (Egress): Charges for data transferred out of Azure to the internet.

2. AWS SFTP Storage (AWS Transfer Family)

AWS provides a fully managed SFTP service called AWS Transfer Family. This service allows you to easily set up, operate, and scale SFTP servers that store and access data in Amazon S3 buckets.

Setup Overview:

  1. Create an S3 Bucket: This will be the backend storage for your SFTP files.

  2. Create an AWS Transfer Family Server: Select SFTP as the protocol.

  3. Choose Identity Provider: Use service-managed users, AWS Identity and Access Management (IAM), or a custom identity provider.

  4. Create Users: For service-managed users, create a user and associate them with an IAM role that grants access to the S3 bucket.

  5. Connect: Use the provided server endpoint with your SFTP client.
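
These steps can also be scripted. A minimal boto3 sketch is below; the IAM role ARN, bucket path, user name, SSH key, and region are placeholders, and the role must already grant access to the S3 bucket:

import boto3

transfer = boto3.client("transfer")

# Create an SFTP server with service-managed users.
server = transfer.create_server(
    Protocols=["SFTP"],
    IdentityProviderType="SERVICE_MANAGED",
)
server_id = server["ServerId"]

transfer.create_user(
    ServerId=server_id,
    UserName="myuser",
    Role="arn:aws:iam::123456789012:role/sftp-s3-access",  # grants S3 access
    HomeDirectory="/my-sftp-bucket/myuser",
    SshPublicKeyBody="ssh-rsa AAAA... user@example.com",
)

# Endpoint format; substitute your own region.
print(f"Endpoint: {server_id}.server.transfer.us-east-1.amazonaws.com")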

Cost Considerations (AWS):

  • SFTP Endpoint: An hourly charge for the SFTP endpoint being enabled (e.g., $0.30 per hour, approximately $216 per month).

  • Data Transfer (Upload/Download): Charges for data uploaded to and downloaded from the SFTP endpoint (e.g., $0.04 per GB).

  • S3 Storage: Billed per GB for data stored in Amazon S3. Pricing varies by storage class (Standard, Infrequent Access, Glacier) and region. In S3 Standard, pricing is around $0.023 per GB/month for the first 50TB, so 5TB runs roughly $118 per month.

  • S3 Operations: Charges for requests made to S3 (PUT, GET, LIST, etc.).

  • Data Transfer (Egress): Standard AWS data transfer out to the internet charges apply.

3. Google Cloud Platform (GCP) SFTP Storage

GCP does not offer a native, fully managed SFTP service comparable to Azure's Blob SFTP or AWS Transfer Family. The typical approach for SFTP on GCP involves deploying and managing an SFTP server on a Google Compute Engine virtual machine (VM), often integrating it with Google Cloud Storage using Cloud Storage FUSE.

Setup Overview:

  1. Create a Google Cloud Storage Bucket: This will serve as the backend storage.

  2. Launch a Compute Engine VM: Choose an appropriate machine type (CPU, RAM) and operating system (e.g., Linux).

  3. Install SFTP Server Software: Install and configure an SFTP server (e.g., OpenSSH) on the VM.

  4. Mount Cloud Storage Bucket: Use Cloud Storage FUSE to mount the GCS bucket as a file system on the VM, allowing the SFTP server to access files in the bucket.

  5. Configure Users and SSH Keys: Set up user accounts and SSH keys on the VM for SFTP access.

  6. Configure Firewall Rules: Open port 22 on the VM's firewall to allow SFTP connections.
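
Once the bucket is mounted, files uploaded over SFTP land as ordinary GCS objects, so you can verify them from anywhere with the google-cloud-storage client. A minimal sketch, assuming application default credentials are configured and using a placeholder bucket name:

# pip install google-cloud-storage

from google.cloud import storage

# Assumes the bucket "my-sftp-bucket" (placeholder) backs the FUSE mount.
client = storage.Client()
for blob in client.list_blobs("my-sftp-bucket"):
    print(blob.name, blob.size)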

Cost Considerations (GCP):

  • Compute Engine VM: Hourly charges for the VM instance based on machine type, vCPUs, and memory. This is a continuous cost for the running VM.

  • Persistent Disk: Charges for the boot disk attached to the VM.

  • Google Cloud Storage: Billed per GB for data stored in GCS. Pricing varies by storage class (Standard, Nearline, Coldline, Archive) and region. In Standard Storage, pricing is approximately $0.020 per GB/month, so 5TB runs roughly $102 per month.

  • GCS Operations: Charges for operations on the GCS bucket.

  • Data Transfer (Egress): Standard GCP data transfer out to the internet charges apply.

  • Cloud Storage FUSE: While the FUSE utility itself is free, the underlying GCS operations it performs will incur charges.

  • Managed Solutions (Marketplace): If using a pre-configured SFTP Gateway solution from the GCP Marketplace (e.g., SFTP Gateway by Thorn Technologies), there will be additional software licensing fees on top of the standard GCP infrastructure costs (VM, storage, network). These can range from $0.07 to $0.15 per hour for the software license.
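
Pulling the figures above together, a back-of-envelope monthly estimate for 5TB can be computed as below. Operations, egress, and AWS's per-GB transfer charge are excluded, and the GCP VM rate (about $0.04/hour for a small instance) is an assumption, so treat the results as rough orders of magnitude only:

GB = 5 * 1024   # 5 TB expressed in GB
HOURS = 730     # hours in an average month

azure = GB * 0.023 + HOURS * 0.30   # Hot LRS storage + SFTP endpoint
aws = GB * 0.023 + HOURS * 0.30     # S3 Standard + Transfer Family endpoint
                                    # (+ $0.04/GB transferred, excluded here)
gcp = GB * 0.020 + HOURS * 0.04     # GCS Standard + assumed small VM

for name, cost in [("Azure", azure), ("AWS", aws), ("GCP", gcp)]:
    print(f"{name}: ~${cost:,.0f}/month")

Note that with a larger VM, or a Marketplace license at the $0.07 to $0.15 per hour quoted above, the GCP figure rises quickly; that is the cost risk the conclusion below refers to.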

Comparison Summary

Feature            | Azure SFTP (Native Blob)                                  | AWS SFTP (Transfer Family)                                          | GCP SFTP (Compute Engine + GCS)
-------------------|-----------------------------------------------------------|---------------------------------------------------------------------|--------------------------------
Management         | Fully managed service                                     | Fully managed service                                               | Self-managed (requires VM setup & maintenance) or Marketplace
Underlying Storage | Azure Blob Storage (Data Lake Gen2)                       | Amazon S3                                                           | Google Cloud Storage
Scalability        | Highly scalable (Blob Storage)                            | Highly scalable (S3)                                                | Scalable via VM resizing/auto-scaling; GCS itself is highly scalable
Authentication     | Local users (password/SSH key); AAD integration (preview) | Service-managed users, IAM, custom identity providers               | OS-level users, SSH keys, PAM (if configured)
Cost Model         | Endpoint hourly, storage, operations, egress              | Endpoint hourly, data transfer (in/out), S3 storage, S3 ops, egress | VM hourly, disk, GCS storage, GCS ops, egress, (software license)
Complexity         | Low                                                       | Low to Moderate (IAM setup)                                         | High (VM setup, SFTP config, FUSE, security)

Conclusion

For a 5TB SFTP solution:

  • Azure and AWS offer fully managed SFTP services that significantly reduce operational overhead. They are generally simpler to set up and scale, making them excellent choices for most use cases where you need a hands-off SFTP solution. Azure's native SFTP on Blob Storage is a direct and cost-effective approach.

  • GCP requires a self-managed approach using a Compute Engine VM, which gives you more control but comes with higher management complexity and potentially higher costs due to VM runtime and associated resources. Marketplace solutions can simplify deployment but add software licensing fees.

When choosing, consider your team's expertise, existing cloud infrastructure, and specific requirements for control and customization versus ease of management and cost efficiency.
