Mayank Singh: Java


Saturday, April 06, 2013

Apache Lucene - Brief Concept & Code Snippet

Searching ... a simple concept with complex algorithms behind it. That's how I see it. I have always been fascinated by that complexity and wanted to implement and use some basic search algorithms.
So I ended up using Apache's Lucene library for search functionality. In this post I will walk through some basic terminology and concepts of this library, along with small code snippets to initialize and use it.

Lucene - It is a text-based search library. The source information used to build an index could be anything: a database, a file system or web sites. You can feed in data from any source you would like to index.

Data is indexed by Lucene using the "Inverted Indexing" technique. That means it "retrieves the pages related to a keyword instead of searching the pages for a keyword".
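To make the inverted-index idea concrete, here is a minimal sketch in plain Java. It is not how Lucene is implemented internally; the class name InvertedIndexSketch and the naive tokenizer are assumptions made only for illustration. The point is that the keyword is looked up in a map of terms, so no document text is scanned at search time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Splits the text on non-alphanumeric characters and records each lower-cased term. */
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Looks the keyword up directly in the map; documents are not re-read. */
    public Set<Integer> search(String keyword) {
        return postings.getOrDefault(keyword.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Lucene is a search library");
        index.add(2, "Solr is built on Lucene");
        index.add(3, "Hibernate is an ORM");
        System.out.println(index.search("lucene"));    // [1, 2]
        System.out.println(index.search("hibernate")); // [3]
    }
}
```

A real engine adds analysis, scoring and on-disk storage on top of this structure, which is what the rest of this post uses Lucene for.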

On Apache's website the following two things are available for download:

Lucene - It is an engine which can be used by programmers to customize searches.
Solr - It's a war file that can be used by non-programmers. It can be directly deployed on Tomcat, Jetty or any other Java web server.

Lucene Concepts

Document - It is the unit of search/index. Just like a row when we fire SQL queries.
Fields - A document consists of one or more fields. Like columns in a row.
Searching - It is done using the "QueryParser" class.
Queries - Lucene has its own mini query language. It has the ability to add weightage to fields, known as boosting.
Building Indexes - To build an index, Lucene needs a directory on the file system where the information can be stored. While indexing records, we need to specify

which fields need to be stored and
which fields need to be indexed.

Directory - FSDirectory is the abstract class which points to the index directory. It has direct sub-classes, which I have listed below.

It is recommended to let Lucene pick the implementation class based on the environment (this is what FSDirectory.open does).
SimpleFSDirectory - Poor concurrent performance
NIOFSDirectory - Poor choice for Windows
MMapDirectory - It has some issues due to a JRE bug.
RAMDirectory - Keeps the index in memory (it is not an FSDirectory sub-class). It cannot handle huge indexes.
There are a few more implementation classes provided by Lucene, which can be easily located in its Javadocs.

Coming to the coding part.

Index Directory

File idxFile = new File(INDEX_DIR); //e.g. /work/index/
Directory idxDir = FSDirectory.open(idxFile);

Prepare Analyzer

Map<String, Analyzer> map = new HashMap<String, Analyzer>(); //field name, Analyzer pair
map.put(FILE_NAME, new StandardAnalyzer(Version.LUCENE_36));
Analyzer analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_36), map);

Index Writer Configuration

IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_36, analyzer);

Index Writer

IndexWriter indexWriter = new IndexWriter(idxDir, config);

Before we go further, let's take a quick peek at Analyzer.

It builds up token streams.
It represents a policy for extracting index terms from text.
Subclasses

PerFieldAnalyzerWrapper
ReusableAnalyzerBase

PatternAnalyzer
KeywordAnalyzer

Moving forward with code snippets to

Create documents to be stored in the index.

Document document = new Document();
Field fPath = new Field(FULL_PATH, path, Field.Store.YES , Field.Index.toIndex(true, true, false));
document.add(fPath); // Add a field to the document. Keep adding all the fields that need to be stored.

Add document to index

indexWriter.addDocument(document);

Fine tune the above process to build the index. Once the index is ready, let's gear up for searching...

Initialize index directory

IndexReader ir = IndexReader.open(idxDir);
IndexSearcher searcher = new IndexSearcher(ir);

Prepare fields analyzers

Map<String, Analyzer> map = new HashMap<String, Analyzer>();
map.put(FILE_NAME, new StandardAnalyzer(Version.LUCENE_36));
Analyzer analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_36), map);

Query Parser

QueryParser qp = new MultiFieldQueryParser(Version.LUCENE_36, new String[]{FILE_NAME, TITLE, ALBUM, ARTIST}, analyzer);// FILE_NAME, TITLE, ALBUM, ARTIST are the fields of Document used in my example
Query query = qp.parse(srchText); //pass in the search keyword
TopDocs res = searcher.search(query, 10);

Iterate through results

for (ScoreDoc sc : res.scoreDocs) {
    Document doc = searcher.doc(sc.doc);
    int docId = sc.doc; // Gives the document ID
    String fileName = doc.getFieldable(FILE_NAME).stringValue(); // Read a field value
}

Hope this helps you get your implementation rolling quickly.

Thanks !!!

Tuesday, April 24, 2012

CSV Excel - Double Click Issue

So here I am again, with a fix for another issue which I tried to resolve with my colleague the other day.

Problem Statement:
We used CSVWriter to write CSV files, and on double clicking one, Excel was not showing the special characters, to be precise Korean characters. However, if we opened the same file in Notepad, the characters rendered properly.

Issue:
The CSV file was getting generated without a BOM (Byte Order Mark). The BOM is nothing but the Unicode character used to signal the byte order of a text file or stream. Its code point is U+FEFF. BOM use is optional and, if used, it should appear at the start of the text stream.

I detected this issue when I opened the file in Notepad++ and found that the "UTF8 without BOM" option was selected under the "Format" menu. I then changed the format to "UTF8" and saved a copy of the file. I double clicked that copy and Excel rendered all the Korean characters.


Fix:
We added the three write statements shown below to put the BOM at the start of the file. That's it.

OutputStream fos = new FileOutputStream("c:/test.csv");
//Write BOM Characters
fos.write(239); //0xEF
fos.write(187); //0xBB
fos.write(191); //0xBF

PrintWriter writer = new PrintWriter(new OutputStreamWriter(fos, "UTF-8"));
//Write all the required contents.

writer.close();
fos.close();
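The same fix can be sketched as a small self-contained class that also checks the result. This is only an illustration under assumptions: the class name BomCsvWriter, the temp file and the sample rows are made up here, and it uses try-with-resources (Java 7+) instead of the explicit close() calls above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class BomCsvWriter {
    // U+FEFF encoded as UTF-8
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Writes the BOM first, then every row as UTF-8 text terminated by CRLF. */
    public static void write(Path target, List<String> rows) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            out.write(UTF8_BOM);
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            for (String row : rows) {
                writer.write(row);
                writer.write("\r\n");
            }
            writer.flush(); // push buffered characters out before the stream is closed
        }
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("bom-demo", ".csv");
        // "\uD55C\uAD6D\uC5B4" is the Korean word for "Korean language"
        write(csv, Arrays.asList("id,name", "1,\uD55C\uAD6D\uC5B4"));
        byte[] bytes = Files.readAllBytes(csv);
        // Show the first three bytes of the file in hex
        System.out.printf("%02X %02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
    }
}
```

Writing the BOM on the raw stream before wrapping it in a Writer keeps the three bytes from passing through the character encoder.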

Thanks !!!

Tuesday, January 24, 2012

Broken Pipe Issue in Hibernate

Hey guys, here is one of the key issues I was facing in one of the web based Java applications I am currently working on. Here is the development environment.

Java 1.6
Struts 2
Hibernate with C3p0
Postgres
Tomcat 6

Assume the database server and Tomcat are both up and running on different machines. If at any point there is a network issue, or for some reason the database server goes down for a few seconds, the c3p0/Hibernate connection pool goes into a bad state and the application keeps throwing "Socket Exception" / "Broken Pipe" errors. Even after the database server comes back up, the application is not able to re-create the connection pool. The only remedy at that point is to restart the Tomcat server. Other users have faced this issue with MySQL as well.

There was a key entry missing from my "hibernate.cfg.xml". Adding this entry resolves the issue.

<property name="hibernate.connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>

Along with the above I do have following properties in my application

I tested this setting and it works fine.
I just fixed my broken pipe !!! :)

Thursday, December 15, 2011

Eclipse Memory Enhancements

Default Eclipse settings are optimal for machines running on 1 GB/low memory. So if you have a machine with more memory, you can use the settings below to improve performance. Add the following settings to the eclipse.ini file (residing in your Eclipse home directory), after the -vmargs line, and start/restart Eclipse.

-Xmn128m
-Xms1024m
-Xmx1024m
-Xss2m
-XX:PermSize=128m
-XX:MaxPermSize=128m
-XX:+UseParallelGC

Thanks.

Wednesday, October 05, 2011

SQL Restriction In Hibernate Criteria Query

Here is an example to add a SQL Restriction in hibernate Criteria query.
Let's consider we have

A Student.java entity
A stored procedure in place which returns total marks obtained by a student in each subject.

public class Student {
    private Integer id;
    private String name;
    .
    .
    //getters & setters methods
}

Assume the get_marks() stored procedure returns records in the following format.

id	Maths	English	Science
1	78	75	23
2	74	70	97

I want to list the Students who have got more than 90 in Maths. Here goes the Hibernate Criteria query.

Criteria c = session.createCriteria(Student.class)
		.add(Restrictions.sqlRestriction(" {alias}.id in (SELECT id FROM get_marks() WHERE maths_total > ? ) ", new Double(90.0), Hibernate.DOUBLE));

In the above statement {alias}.id refers to the id column of the Student table. The above Restrictions.sqlRestriction example shows the SQL restriction with only one parameter.
What if you have to pass multiple arguments?
In such cases you need to pass an array of Objects and an array of Types as the second and third parameters, i.e. Restrictions.sqlRestriction(String, Object[], Type[]). In the SQL string, put a ? wherever an argument is required, as shown in the above example.

I hope this helps.
Thanks.

Wednesday, August 10, 2011

Hibernate Interceptors

Are you using hibernate and thinking of using Database Triggers? Then wait....

If most of your database operations are done via stored procedures then having triggers in place is the only choice to intercept any DML calls.

But since we are using Hibernate, we need a more object-oriented way of handling things.
Hibernate interceptors are similar to triggers in a database. Monitoring DML operations via triggers is the most efficient way, but the business logic won't be easy to maintain, understand and change when all database operations are done via Hibernate.

Coming straight to the point, Hibernate provides the "org.hibernate.Interceptor" interface to hold the business logic for pre or post operations around Hibernate's DML calls. The following methods need to be implemented for "org.hibernate.Interceptor".



public interface Interceptor {

	// Transaction callbacks
	void afterTransactionBegin(Transaction tx);
	void beforeTransactionCompletion(Transaction tx);
	void afterTransactionCompletion(Transaction tx);

	// Dirty checking and entity resolution
	int[] findDirty(Object entity, Serializable id, Object[] currentState,
			Object[] previousState, String[] propertyNames, Type[] types);
	Object getEntity(String entityName, Serializable id) throws CallbackException;
	String getEntityName(Object object) throws CallbackException;
	Object instantiate(String entityName, EntityMode entityMode, Serializable id)
			throws CallbackException;
	Boolean isTransient(Object entity);

	// Entity lifecycle callbacks
	void onDelete(Object entity, Serializable id, Object[] state,
			String[] propertyNames, Type[] types) throws CallbackException;
	boolean onFlushDirty(Object entity, Serializable id, Object[] currentState,
			Object[] previousState, String[] propertyNames, Type[] types) throws CallbackException;
	boolean onLoad(Object entity, Serializable id, Object[] state,
			String[] propertyNames, Type[] types) throws CallbackException;
	boolean onSave(Object entity, Serializable id, Object[] state,
			String[] propertyNames, Type[] types) throws CallbackException;

	// Flush callbacks
	void preFlush(Iterator entities) throws CallbackException;
	void postFlush(Iterator entities) throws CallbackException;

	// (collection and SQL statement callbacks are not listed here)
}

Since you won't be interested in implementing all of the methods, Hibernate provides the "org.hibernate.EmptyInterceptor" class, which has an empty implementation of the Interceptor interface. You can then override just the methods you need. Below is the sample class which I plugged into the Hibernate configuration to print out the primary key of any object that is getting saved.



package com.test.interceptor;

import java.io.Serializable;

import org.hibernate.EmptyInterceptor;
import org.hibernate.type.Type;

public class InsertMonitorInterceptor extends EmptyInterceptor {

	@Override
	public boolean onSave(Object entity, Serializable id, Object[] state,
			String[] propertyNames, Type[] types) {
		// Print the primary key and the entity that is about to be saved
		System.out.println("Saving..." + id + ": " + entity);
		return super.onSave(entity, id, state, propertyNames, types);
	}

}

Once you have your Interceptor class implementation ready, it can be very easily plugged into any existing Hibernate configuration as shown below.

new Configuration().configure(DBUtil.class.getResource("hibernate.cfg.xml"))
		.setInterceptor(new InsertMonitorInterceptor())
		.buildSessionFactory();

That's it. You can write different interceptor implementations and plug them in the same way to start using them.
Hope this article will be helpful. Happy intercepting.

Thanks.

Tuesday, February 23, 2010

RSS - Really Simple Syndication

With RSS, it is possible to distribute up-to-date web content from one web site to several other web sites or other applications. RSS has been designed to show selected data.
There are several web sites which provide RSS feeds. A consumer can consume these RSS feeds from multiple web sites and view the consolidated information in his application. If you are designing your own application and wondering how to consume these RSS feeds, then you are at the right spot.
RSS feeds are nothing but XML data, so you can either write your own parser for this XML data or use a third party library to parse it for you.
An RSS feed may look like:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">

<channel>
  <title>RSS Test</title>
  <link>http://www.rsstest.com</link>
  <description>RSS sample file</description>
  <item>
    <title>RSS Introduction</title>
    <link>http://www.rsstest.com/rssintro</link>
    <description>RSS Introduction</description>
  </item>
  <item>
    <title>RSS Java Code</title>
    <link>http://www.rsstest.com/java</link>
    <description>How to consume RSS in java</description>
  </item>
</channel>
</rss>
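For comparison, the "write your own parser" route can be sketched with nothing but the DOM parser that ships with the JDK. This is a minimal illustration, not a complete RSS reader: the class name RssTitles is made up, the feed is passed in as a String, and only the item titles are extracted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RssTitles {
    /** Returns the <title> of every <item> in the feed, in document order. */
    public static List<String> itemTitles(String rssXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(rssXml)));
        NodeList items = doc.getElementsByTagName("item");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            // Search inside the item only, so the channel's own <title> is skipped
            titles.add(item.getElementsByTagName("title").item(0).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String feed = "<rss version=\"2.0\"><channel><title>RSS Test</title>"
                + "<item><title>RSS Introduction</title></item>"
                + "<item><title>RSS Java Code</title></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(feed)); // [RSS Introduction, RSS Java Code]
    }
}
```

Dates, enclosures and the different feed versions are where hand-written parsing gets tedious, which is the argument for a library.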

In the following example, I have used informa.jar, a third party library, as I don't want to get into the details of parsing XML.
I have used a feed from the Discovery site, so the example will display the top stories from that site.

public class Client {

    public static void main(String[] args) {
        Client client = new Client();
        client.readRSS();
    }

    public void readRSS() {
        try {
            ChannelIF channel = FeedParser.parse(new ChannelBuilder(),
                    new URL("http://dsc.discovery.com/news/subjects/animals/xdb/topstories.xml"));
            //Collection feeds = OPMLParser.parse(inpFile);
            Collection collection = channel.getItems();
            Iterator it = collection.iterator();
            ItemIF itemIF;
            while(it.hasNext()) {
                itemIF = (ItemIF)it.next();
                System.out.println("Title: " + itemIF.getTitle());
                System.out.println("Link: " + itemIF.getLink());
                System.out.println(itemIF.getDate());
                System.out.println("Description: " + itemIF.getDescription());
            }
            System.out.println("Title: " + channel.getTitle());
            System.out.println("Copyright: " + channel.getCopyright());
            System.out.println("DONE");
        } catch(Exception e){
            e.printStackTrace();
        }
    }
}

That's all.
Enjoy!!! Have Fun !!! :)

Thursday, June 18, 2009

Special characters in web application

I've been adding text resources to my application so as to display the web application in different languages. The problem I faced was that whenever a special character comes up, the browser displays "?". The browser is not able to display the special character in spite of UTF-8 being selected as the encoding.
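Where the "?" comes from can be reproduced in a few lines of plain Java. This is only a sketch of the general mechanism, assuming that some layer between the resource file and the browser encodes the text as ISO-8859-1; the class name EncodingDemo is made up. Characters that the charset cannot represent are replaced with "?" during encoding, and that loss cannot be undone later.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    /** Encodes and decodes with ISO-8859-1; unmappable characters become '?'. */
    public static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with UTF-8; every Unicode character survives. */
    public static String throughUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String korean = "\uD55C\uAD6D\uC5B4"; // three Korean characters
        System.out.println(throughLatin1(korean));              // ???
        System.out.println(throughUtf8(korean).equals(korean)); // true
    }
}
```

This is why every hop (page, request, response, server) has to agree on UTF-8, as the changes below do.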

So I made some changes in the code to handle UTF-8 character set.

Make sure that all JSP pages have
<%@ page contentType="text/html; charset=UTF-8"%>

If you have HTML pages, use a meta tag to add the character set information.
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />

Second, make changes in the server's configuration file. I am using Sun's web server, so in my sun-web.xml file I added <parameter-encoding default-charset="UTF-8"/> to look like


<sun-web-app>
 <parameter-encoding default-charset="UTF-8"/>
.
.
</sun-web-app>

In case of Apache Tomcat, you have to specify the URIEncoding attribute on the Connector tag in server.xml.

But this is something the server will be handling, and it's not in our hands. To make sure that the application uses the UTF-8 character set, a filter can be added to the web application.


import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
 
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.ServletException;
import javax.servlet.Filter;
import javax.servlet.FilterConfig;
import javax.servlet.FilterChain;
 
import java.io.IOException;
 
public class CharacterEncodingFilter implements Filter 
{    
 
 private FilterConfig fc;    
 
 public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException 
 {        
  
  HttpServletRequest request = (HttpServletRequest) req;        
  HttpServletResponse response = (HttpServletResponse) res;        
  
  response.setContentType("text/html; charset=UTF-8");        
  request.setCharacterEncoding("UTF8");        
  
  chain.doFilter(request, response);        //do it again, since JSPs will set it to the default        
  
  response.setContentType("text/html; charset=UTF-8");        
  request.setCharacterEncoding("UTF8");    
 }        
 
 public void init(FilterConfig filterConfig) 
 {        
  
  this.fc = filterConfig;    
 }        
 
 public void destroy() 
 {        
  
  this.fc = null;    
 }
}

Once you are done writing the Java class, add the filter configuration to the web.xml file.


<filter>        
 <filter-name>CharacterEncodingFilter</filter-name>        
 <filter-class>com.its.struts.action.CharacterEncodingFilter</filter-class>    
</filter>

<filter-mapping>        
 <filter-name>CharacterEncodingFilter</filter-name>        
 <servlet-name>action</servlet-name>
</filter-mapping>

That's all !!! :)

Saturday, March 14, 2009

Array Binding In Spring

Sending data from a JSP to the controller is easy, no matter whether it is a string, an integer or an array. The problem occurs when displaying the JSP page if the data present in the command class is an array.
The following is sample JSP code to bind an array of names to the command class in Spring. The code will both send and display the data.


<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<%@ taglib prefix="form" uri="http://www.springframework.org/tags/form" %>
<%@ taglib prefix="core" uri="http://java.sun.com/jstl/core" %>
<html>
<head>
<title>Test</title>
</head>
<body>
<form:form commandName="test" method="post" action="welcome.do" >
    <core:set var="flag" value="false" />
    <!-- Create the n number of name fields if data is present -->
    <core:forEach items="${test.name}" var="n" varStatus="status">
        <form:input path="name[${status.index}]"  />
        <core:set var="flag" value="true" />
    </core:forEach>
    <!-- If data is not present then create 2(default) name fields -->
    <core:if test="${flag eq false}">
            <form:input path="name"  />
            <form:input path="name"  />
    </core:if>
    <input type="submit" value="submit" />
</form:form>
</body>
</html>

In the above example, String[] name; is used in the command class.
That's all !!! :)

Saturday, February 28, 2009

XML Parsing

There is a sharp rise in information exchange between applications, and XML has become the standard. Here is a simple XML parsing example which gives a basic idea of fetching information from an XML file.

The sample XML file:

<?xml version="1.0"?>
<company name="my company">
<employee id="1">
<name>Mike</name>
</employee>
<employee id="2">
<name>Mark</name>
</employee>
</company>

Java code to read the XML file:

package pkg;

import java.io.File;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XMLReader {

    public void readXML(String fileName) throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = factory.newDocumentBuilder();
        Document document = documentBuilder.parse(new File(fileName));

        //GET THE ROOT ELEMENT
        Element root = document.getDocumentElement();
        System.out.println("Root: " + root.getNodeName());
        System.out.println("Company Name: " + root.getAttribute("name"));

        //GET THE LIST OF CHILD ELEMENTS UNDER ROOT
        NodeList employeeList = root.getElementsByTagName("employee");
        System.out.println("Total employees: " + employeeList.getLength());

        Element employee;
        String name = null, id = null;
        for(int i=0;i<employeeList.getLength();i++) {
            employee = (Element)employeeList.item(i);

            //FETCH ID OF EMPLOYEE
            if(employee.hasAttributes()) {
                id = employee.getAttributes().getNamedItem("id").getNodeValue();
            }

            //GET NAME OF THE EMPLOYEE
            name = employee.getElementsByTagName("name").item(0).getTextContent();
            System.out.println("Id: " + id + "\tName: " + name);
        }
    }
}

Happy parsing !!! :)

Tuesday, November 14, 2006

Want to sing...

Ah ha! So finally after a long time I got something interesting to do...

    I remember the last time, when I started RnD work on decoding MP3 files...uff!!! That was so frustrating. I banged my head for around 5 days continuously, but all in vain, except that I learned to read the ID3 tag information. At least I got something...

    I went to Mumbai last weekend. The trip was awesome and I enjoyed it a lot. Tonnes of thanks to my fosla friends in Mumbai and Gyanu. There I came across an article on Ogg Vorbis, another audio file format equivalent to MP3. The most interesting thing is that it is open source, whereas MP3 is patented, which means you have to pay; this one is totally free for both commercial and non-commercial use. The compression ratio is also high. Good naaa ;-)

    Hope my code will sing this format. :-)

Friday, October 06, 2006

Pool connection

Here is a quick way to make a connection pool...

Get the jar file, that will be specific to the database to which you want to connect and place it in the /WEB-INF/lib folder.

And edit /META-INF/context.xml...
<Context reloadable="true">
    ....
    <Resource name="jdbc/dbtest"
              auth="Container"
              type="javax.sql.DataSource"
              factory="org.apache.tomcat.dbcp.dbcp.BasicDataSourceFactory"
              driverClassName="org.postgresql.Driver"
              url="jdbc:postgresql://IPAddress:port/databasename"
              username="root"
              password="root"
              maxActive="5"
              initialSize="3"
              maxIdle="2"
              maxWait="10000"
              removeAbandoned="true"
              removeAbandonedTimeout="60"
              logAbandoned="true" />
    ....
</Context>

Java code to get connection from the pool...
InitialContext initCtx = new InitialContext();
Context ctx = (Context) initCtx.lookup("java:comp/env");
DataSource ds = (DataSource) ctx.lookup("jdbc/dbtest");
Connection conn = ds.getConnection();

That's it. Start the tomcat server and launch the application...
Here are the details of the attributes used in xml file.

Name: The java:comp/env prefix is the name of the JNDI context for the component. This is used in the Java code. The jdbc/dbtest string is the JNDI name for the resource reference. The JNDI names for JDBC DataSource objects are stored in the java:comp/env/jdbc subcontext.

auth: Authentication will be done by the container. I don't know much about this. Will update soon.

driverClassName: Fully qualified Java class name of the JDBC driver to be used.

url: Connection URL to be passed to JDBC driver.

username: Database login username.

password: Database login password.

maxActive: (Default: 8) The maximum number of active instances that can be allocated from this pool at the same time.

maxIdle: (Default: 8) The maximum number of connections that can sit idle in this pool at the same time.

maxWait: (Default: indefinitely) The maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception.

initialSize: (Default: 0) The initial size of the pool.

removeAbandoned: (Default: false) Flag to remove abandoned connections if they exceed the removeAbandonedTimeout. If set to true, a connection is considered abandoned and eligible for removal if it has been idle longer than the removeAbandonedTimeout. Setting this to true can recover db connections from poorly written applications which fail to close a connection.

removeAbandonedTimeout: (Default 300) Timeout in seconds before an abandoned connection can be removed.

logAbandoned: (Default: false) Flag to log stack traces for application code which abandoned a Statement or Connection.
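Since removeAbandoned and logAbandoned only exist to rescue connections that the code forgot to return, it is better not to leak them in the first place. Below is a minimal sketch, assuming Java 7+ for try-with-resources; the class name PooledQuery and the "SELECT 1" ping query are made up for illustration. Closing a pooled connection hands it back to the pool rather than closing the physical connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class PooledQuery {
    /** Looks up a DataSource bound under java:comp/env, e.g. "jdbc/dbtest". */
    public static DataSource lookup(String jndiName) throws NamingException {
        Context env = (Context) new InitialContext().lookup("java:comp/env");
        return (DataSource) env.lookup(jndiName);
    }

    /** Borrows a connection, runs a trivial query and always returns the connection. */
    public static boolean ping(DataSource ds) throws SQLException {
        try (Connection conn = ds.getConnection();
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT 1")) {
            return rs.next();
        } // rs, st and conn are closed here, even if the query throws
    }

    public static void main(String[] args) throws Exception {
        // Only works inside the container, where java:comp/env is bound
        System.out.println(ping(lookup("jdbc/dbtest")));
    }
}
```

Outside Tomcat the lookup fails with a NamingException, because nothing has bound java:comp/env.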

Note: For more details check commons-dbcp documentation.
Thanks
E ñ j Ô ÿ !!! iN D
p 0 o l
:-)

Monday, September 18, 2006

Enabling SSL in Tomcat

I have done this on Windows; I haven’t tried it on a Linux machine.
It’s easy and simple, just follow these steps:

First we will need to generate a keystore file, containing a self-signed certificate. Execute the following command on the prompt
%JAVA_HOME%\bin\keytool -genkey -alias tomcat -keyalg RSA
Open file <tomcat-home-dir>\conf\server.xml.
Find and uncomment the following code:
<Connector port="8443" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" disableUploadTimeout="true" acceptCount="100" debug="0" scheme="https" secure="true" clientAuth="false" sslProtocol="SSL" />
Change the value of the port if you need.
Here clientAuth is set to false. Change its value to true if you want client authentication. In that case, if client authentication fails, it won’t let you proceed. Set it to want if you want to proceed even if authentication fails.
If sslProtocol is set to TLS and some browsers fail to connect, set it to SSL. (TLS is the IETF-standardized successor to SSL rather than a vendor implementation, so the default TLS value is normally the better choice.)
Now try opening https://localhost:8443. It should display the tomcat’s home page. If it is not installed, try running some other application that you have deployed with https.

For more details you can view the SSL documentation that comes with the tomcat on your local machine or you can logon to the website http://localhost:8080/tomcat-docs/ssl-howto.html.

E ñ j Ô ÿ !!!