Monthly Archives: February 2016

  • Install Single Node Hadoop on Mac

    Operating System: Mac OS X Yosemite
    Hadoop Version: 2.7.2

    Pre-requisites
    We need to enable SSH to localhost without a passphrase.

    Go to System Preferences > Sharing and turn “Remote Login” on.

    Now in a terminal window, ensure that the following succeeds with no passphrase.
    $ ssh localhost
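
    If `ssh localhost` still asks for a password, a passphrase-less key pair usually fixes it. The following is a minimal sketch rather than the only way to do this. The function name `setup_passphraseless_ssh` and its optional directory argument are made up for illustration, and it assumes the stock OpenSSH tools that ship with OS X.

```shell
# Create a passphrase-less RSA key (if none exists) and authorize it for
# logins to this machine. The optional first argument is the SSH directory;
# it defaults to ~/.ssh. The function name is illustrative only.
setup_passphraseless_ssh() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1

    # -P '' sets an empty passphrase; skip generation if a key already exists.
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -P '' -f "$ssh_dir/id_rsa" || return 1
    fi

    # Append the public key only if it is not already authorized.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

    Calling `setup_passphraseless_ssh` with no arguments works on `~/.ssh`. Afterwards, `ssh localhost` should log in without prompting.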

    Download Hadoop Distribution
    Download the Hadoop 2.7.2 distribution from http://mirrors.ibiblio.org/apache/hadoop/common/hadoop-2.7.2/

    Hadoop Configuration Files

    Go to the directory where your hadoop distribution is installed.

    Then edit the following files (hadoop_distro stands for that directory):
    hadoop_distro/etc/hadoop/hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
    

    hadoop_distro/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>
    

    hadoop_distro/etc/hadoop/yarn-site.xml

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>
    

    hadoop_distro/etc/hadoop/mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    

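    Run the following commands from the hadoop_distro directory.

    Format HDFS (first run only; formatting erases any existing HDFS metadata)
    $ bin/hdfs namenode -format

    Start HDFS
    $ sbin/start-dfs.sh

    Start YARN
    $ sbin/start-yarn.sh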
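
    To confirm that everything came up, the JDK's `jps` tool lists the running Java processes. A single-node setup like this one should show NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The helper below is a sketch that prints whichever of those are absent; the name `missing_daemons` is made up for illustration.

```shell
# Print the expected single-node Hadoop daemons that do NOT appear in the
# output of `jps`. Prints nothing when all five are running.
missing_daemons() {
    running=$(jps)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -w matches whole words, so "NameNode" does not match "SecondaryNameNode".
        printf '%s\n' "$running" | grep -qw "$daemon" || echo "$daemon"
    done
}
```

    No output from `missing_daemons` means all five daemons are running. With the default Hadoop 2.x ports, the NameNode web UI is at http://localhost:50070/ and the ResourceManager UI is at http://localhost:8088/.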

  • Amazon Web Services Data Security

    AWS provides many options for encrypting the data that you put in the cloud.

    Some of the options include:

    1. Client Side Encryption
    2. Server Side Encryption

    Client Side Encryption

    Client Side Encryption refers to encrypting the data before you put it in the AWS Cloud. In this case, you can either manage your own key or use an AWS Key Management Service (KMS) key.

    Server Side Encryption

    Server Side Encryption refers to AWS encrypting data as it is written into the cloud. Here you can provide your own key, use a key managed by AWS KMS, or use a key managed by Amazon S3.

    If you require high levels of confidentiality for your data, I suggest the following:

    • Create a Customer Master Key (CMK) in a region.
    • Pass the CMK's ID to the AWS API, which creates a data key on the server side.

    A CMK can directly encrypt at most 4 KB of data, which makes it well suited to encrypting the data key. The data key, in turn, can encrypt data of any size.

    Using CMK with Server Side Encryption is a good solution to confidentiality needs in AWS.
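
    As a rough sketch of the two steps above using the AWS CLI: the function names and their arguments are made up for illustration, and the code assumes the `aws` command is installed and configured with credentials that are allowed to use the CMK.

```shell
# Ask KMS for a new 256-bit data key under the given CMK. The JSON response
# holds the key twice: "Plaintext" (use it to encrypt, then discard it) and
# "CiphertextBlob" (the same key encrypted under the CMK; store it with the data).
generate_data_key() {
    cmk_id="$1"
    aws kms generate-data-key --key-id "$cmk_id" --key-spec AES_256
}

# Upload a file to S3 and have S3 encrypt it server side with a KMS key.
upload_with_sse_kms() {
    file="$1"; bucket="$2"; object_key="$3"; cmk_id="$4"
    aws s3 cp "$file" "s3://$bucket/$object_key" \
        --sse aws:kms --sse-kms-key-id "$cmk_id"
}
```

    For example, `upload_with_sse_kms report.csv my-bucket reports/report.csv alias/my-cmk` would upload the file encrypted under that key; the file, bucket and alias names here are placeholders.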

  • Apache Kafka in a public cloud by IBM BlueMix

    IBM BlueMix offers Apache Kafka as part of the IBM Message Hub Service.

    What is BlueMix?
    Bluemix is an open-standards, cloud-based platform for building, running, and managing applications.

    • Build your apps, your way. Use the most prominent compute technologies to power your app: Cloud Foundry, Docker, OpenStack.
    • Extend apps with services. A catalog of IBM, third-party, and open source services allows the developer to stitch an application together quickly.
    • Scale more than just instances. Development, monitoring, deployment, and logging tools allow the developer to run and manage the entire application.
    • Layered security. IBM secures the platform and infrastructure and provides you with the tools to secure your apps.
    • Deploy and manage hybrid apps seamlessly. Get a seamless dev and management experience across a number of hybrid implementation options.
    • Flexible pricing. Try compute options and services for free and, when you’re ready, pay only for what you use.