Monday, May 18, 2015

How To Publish Software Artifacts To Central Repository

I have had to release software artifacts to the Central Repository (which used to be known mainly as Maven Central) a couple of times now. Each time I found myself fumbling at one step or another and had to resort to Googling to sort things out. This post is basically a compilation of all my “groping in the dark”, because unfortunately, getting a piece of software you have written into the Central Repository, where the whole world can download it, is not yet a matter of pushing a button or running a single command.

So I am writing this post as a reference for myself and for anyone else who might find it useful: a place to document the steps needed to publish software artifacts to the Central Repository, while listing the recurring hurdles I kept running into along the way. I have also included a couple of useful links at the end of the post.

The steps involved can be roughly summarized as:
  1. Create an account on oss.sonatype.org.
  2. Create a PGP key pair and share the public key.
  3. Update settings.xml and pom.xml appropriately.
  4. Upload your artifact to oss.sonatype.org.
  5. Promote the release.
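
As a rough illustration of step 3, here is a minimal sketch of the credentials entry that typically goes into settings.xml. The server id `ossrh` is only a conventional choice, and the username and password values are placeholders, not real credentials; adjust all three to your own setup.

```xml
<!-- ~/.m2/settings.xml (sketch; values below are placeholders) -->
<settings>
  <servers>
    <server>
      <!-- Assumed id; it must match the repository id used in pom.xml -->
      <id>ossrh</id>
      <!-- Your oss.sonatype.org account credentials -->
      <username>your-sonatype-username</username>
      <password>your-sonatype-password</password>
    </server>
  </servers>
</settings>
```

Whatever id you pick, the repository entries under distributionManagement in pom.xml need to use the same id, otherwise Maven will not find the credentials when uploading.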

Saturday, May 16, 2015

Krwkrw 0.1.2 Released.

Just pushed the latest release (0.1.2) of Krwkrw to Maven Central.

Krwkrw is a web crawler/scraper... scraper is more apt, actually.

If using Maven as your build tool, you can add it to your project via:

<dependency>
    <groupId>com.blogspot.geekabyte.krwkrw</groupId>
    <artifactId>krwler</artifactId>
    <version>0.1.2</version>
</dependency>

If using Gradle, then:

dependencies {
    compile "com.blogspot.geekabyte.krwkrw:krwler:0.1.2"
}

A quick rundown of what is worth mentioning in this release:
  • Three new utility classes that make it easy to store crawled web pages in a relational database, in ElasticSearch, or in a CSV file.
  • It is now possible to register a callback that is fired when the crawling operation terminates. This should be most useful when crawling is done in Async mode.
  • Fixed a bug where broken links were crawled multiple times.
  • General improvements to the API, tests, etc.
You can see the Readme for more information. And yeah, the 'a' was dropped from the name, from Krawkraw to Krwkrw, because an all-consonant name sounds cooler.

For a background story on how Krwkrw came to be, please read A web scraper/crawler in Java: Krawkraw