Friday, May 25, 2012

Flume-Solr Integration

This post shows how to integrate Flume with Solr. I have created a new sink for this purpose. The sink is usually used with the regexAll decorator, which performs a light transformation of event data into attributes. These attributes are converted into a Solr document and committed to Solr.

What is Solr
Solr is an open source enterprise search server based on Lucene. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language.

What is Flume
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management.

I have used flume-0.9.3 and apache-solr-3.1.0 for this POC.

The RegexAllExtractor decorator prepares events so that they contain attributes ready to be written into Solr. Implementing a RegexAllExtractor decorator is very simple.
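To make the idea concrete, here is a minimal sketch of the two steps described above: extracting attributes from an event body with a regular expression, and turning those attributes into a Solr XML update message. It is written in Python purely for illustration and is not the code of the sink itself (Flume 0.9.3 sinks and decorators are written in Java). The log format, the field names, and the helper names `extract_attributes` and `to_solr_add_xml` are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical log format and field names, chosen only for this illustration.
LOG_PATTERN = re.compile(r"(?P<host>\S+) (?P<level>[A-Z]+) (?P<message>.*)")


def extract_attributes(body):
    """Return the named regex groups of an event body as a dict ({} if no match)."""
    match = LOG_PATTERN.match(body)
    return match.groupdict() if match else {}


def to_solr_add_xml(doc_id, attributes):
    """Build a Solr XML update message of the form <add><doc>...</doc></add>."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    # Sort the attributes so the generated message is deterministic.
    for name, value in sorted(attributes.items()):
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


attrs = extract_attributes("web01 ERROR disk quota exceeded")
xml_message = to_solr_add_xml("event-1", attrs)
print(xml_message)
```

A message like this would then be posted to Solr's update handler and followed by a commit. The field names must exist in the Solr schema, which is also an assumption here.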

Saturday, May 5, 2012

S3 instead of HDFS with Hadoop

In this article we will discuss using S3 as a replacement for HDFS (Hadoop Distributed File System) on AWS (Amazon Web Services), and why S3 is needed. Before coming to the original use case and the performance of S3 with Hadoop, let's understand what Hadoop is and what S3 is.

Let's look at what the exact problems are and why HDFS is not used in the cloud. When new instances are launched in the cloud to build a Hadoop cluster, they do not have any data associated with them. One approach is to copy the entire dataset onto them, which is not feasible for several reasons, including bandwidth, time to copy, and the associated cost. Secondly, once the jobs complete, you need to copy the results back before terminating the cluster machines; otherwise the results are lost when the instances are terminated. Also, because of the associated cost, running the entire cluster just for data collection is not feasible.
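As a rough back-of-the-envelope illustration of the copy-time problem, the sketch below estimates how long it takes to move a dataset over a network link. The dataset size and bandwidth are hypothetical numbers picked only for the example, not measurements, and the calculation ignores protocol overhead and parallel transfers.

```python
def transfer_hours(dataset_gb, bandwidth_mbps):
    """Estimate the hours needed to copy dataset_gb gigabytes at bandwidth_mbps megabits/s."""
    megabits = dataset_gb * 8000  # 1 GB = 8000 megabits (decimal units)
    seconds = megabits / bandwidth_mbps
    return seconds / 3600


# Hypothetical example: a 1000 GB dataset over a sustained 100 Mbit/s link.
hours = transfer_hours(1000, 100)
print(f"{hours:.1f} hours")
```

The same cost is paid again in the other direction when the results are copied back before the instances are terminated.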