Disclaimer: Since I've rebuilt this blog and saved a few of my previous posts, there may be links/references/tutorials in this post that are out of date and/or no longer exist. This post was originally written on March 18th, 2015.


As a relatively new developer, I have a very long list of things that I want to learn. New frameworks, languages, architectures, technologies...I could go on for days. I also have a somewhat bad habit of reading through resources on a specific language/framework/technology but never actually applying it in a practical project.

With that in mind, I've been paying a lot of attention recently to one of the rising stars in the JVM world, Clojure.

I've read through the wonderful Clojure for the Brave and True (and referenced back to it several times whenever I got lost). I worked through the basic and intermediate Om framework tutorials. I decided that (for now) web application development in Clojure was just slightly out of reach as I didn't quite understand the Clojure basics. I struggled for a little bit while coming to terms with the "Clojure state of mind". And finally, I decided it was time to just build something.

As part of an ongoing work project, we're in the middle of testing a file transfer/batch processing system that is undergoing a vendor integration. One of the batch processes under my umbrella was written around six or so years ago using Java 1.4 and was in sore need of some TLC. I'd been working through refactoring with the intent of bringing it up to Java 1.7 and adding more clarity for easier support.

Last Friday, I decided to take the afternoon and rewrite the entire thing in Clojure as a little experiment. I wanted to see how easy it would be to replicate the functionality while keeping to the standards that Clojure enforces. And, once rewritten, how much cleaner could it get and how much of a performance increase would I see?

The Rewrite

In its simplest form, this batch process checks for a comma-delimited file and, if present, moves it to a network location for consumption. It then checks for an outbound file, and if present, checks each response, emailing any errors it finds, and places the file on yet another network location for transfer to the vendor. I know, it's not the most technologically-current design or the most up-to-date way of doing things, but it's a great opportunity to get familiar with Clojure basics.

Before I get too far in, I'll list out the dependencies I used in the implementation:

I stubbed out the basic function definitions (four in total) and vars (three in total) in the core namespace. I also took some time to research mailer and figure out what configuration I needed to do (there wasn't much). Since the email piece seemed to be the easier of the two initial paths, I followed mailer's documentation and came up with the full implementation in one shot (var definitions and imports omitted for brevity):

File: email.clj

(defn email-error
  "Sends an erroneous record in a request file to a select group of recipients"
  [bad-record]
  (info (str "Sending error email to " (:recipients mail-config)))
  (mailer/with-settings {:host (:smtp mail-config)}
    (mailer/with-delivery-mode (:mode mail-config)
      (mailer/deliver-mail {:from "e@mail.com"
                            :to (:recipients mail-config)
                            :subject "Error during processing"}
                           "templates/record-error.mustache"
                           {:date (l/local-now)
                            ;; clojure.string/join takes the separator first, then the collection
                            :record (clojure.string/join "\t" bad-record)}))))

There's probably quite a bit of code in there that is redundant (for example, calling mailer/with-delivery-mode after already calling delivery-mode! earlier), but for a first pass, it's functional and helped me get off to a good start.

The implementation of the core functions was fairly boring, although it included a little bit of Java interop, which was refreshing. I got to deal with some let scoping and do calls, which have always just seemed awkward to me, as well as some file I/O. There was quite a bit of boolean logic inside these functions, so getting my hands dirty with some larger and more complex if and loop/recur forms was fun (aside: I ended up removing the loop/recur, as I couldn't get it to work right in debug).
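For readers who haven't met these forms yet, here is a minimal sketch of let scoping, do, loop/recur, and Java interop together. It is not the code from the batch process; the function names (file-present?, describe-file, count-matching) are made up for illustration.

```clojure
(import '(java.io File))

(defn file-present?
  "True when a regular file exists at dir/filename (Java interop via java.io.File)."
  [dir filename]
  (let [f (File. (str dir) (str filename))] ; f is only visible inside this let
    (.isFile f)))

(defn describe-file
  "Uses if with do so one branch can log a message and still return a value."
  [dir filename]
  (if (file-present? dir filename)
    (do
      (println "Found" filename)
      :found)
    :missing))

(defn count-matching
  "Counts the items that satisfy pred with an explicit loop/recur."
  [pred items]
  (loop [remaining (seq items)
         n 0]
    (if remaining
      (recur (next remaining)
             (if (pred (first remaining)) (inc n) n))
      n)))
```

In practice something like count-matching is usually written as (count (filter pred items)), which may be part of why a hand-written loop/recur often ends up being removed.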

There was one part that really took me a bit of work to get functional, so I'm detailing both functions below before I walk through them...

File: core.clj

(defn- validate-record
  "Validates a single record from a request file. Returns true to keep the record; otherwise logs a warning, emails the record, and returns false."
  [record]
  ;; let keeps rec local; a def inside a function would create a namespace-level var
  (let [rec (zipmap [:doc-num :status-date :request-quantity :origin :destination
                     :ticket-status :coupon-num :ticket-num :airline-code :check-digit]
                    (first record))]
    (if (re-matches #"\d{2}\/\d{2}\/\d{4}\s+\d{2}:\d{2}.\d{3}" (:status-date rec))
      (do
        (warn (str "Bad record detected; document number " (:doc-num rec)))
        (email/email-error record)
        false)
      true)))

(defn check-outbound
  []
  (let [status-file (io/file (str (:outbound-path outbound-config) "/" (:outbound-filename outbound-config)))]
    (when (.exists status-file)
      (let [;; "mm" is minutes in a Joda-Time pattern; "MM" is the month
            timestamp (f/unparse (f/formatter "yyyyMMddHHmmss") (l/local-now))
            status-destination (io/file (str (:outbound-destination outbound-config) "/TicketStatus" timestamp ".txt"))
            status-archive (io/file (str (:outbound-archive outbound-config) "/ticketstatus" timestamp ".txt"))
            split-data (map #(str/split (first %) #"\t") (csv/parse-csv (io/reader status-file)))]
        ;; Assumes csv is clojure-csv.core, whose write-csv returns a string, so spit writes it to the file
        (->> split-data
             (filter validate-record)
             csv/write-csv
             (spit status-destination))))))

This particular set of functions and the functionality they represent was possibly the hardest part of this whole experiment. The overall idea was to iterate through each row of the file and retrieve all of the records in a way that would allow them to be processed individually. My first attempt was a relatively naive (csv/parse-csv (io/reader status-file) :delimiter "\t"); however, it failed spectacularly. After trying it in the REPL, I discovered it wasn't separating the fields of each record, but was reading each record as a single string and placing it into a vector of vectors, with each child containing only that one string. map was my go-to solution for this, as I was able to succinctly retrieve each child vector and split its first element (the entire record) into the format I originally expected.
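To make the shape of that data concrete, here is a small sketch using only clojure.string. The sample rows are hypothetical stand-ins for the parsed output described above (one single-string vector per record), and split-rows is an illustrative name, not a function from the project.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample: each record arrives as a vector holding one tab-separated string.
(def parsed-rows
  [["1001\t03/18/2015\t2"]
   ["1002\t03/19/2015\t5"]])

(defn split-rows
  "Splits the lone string in each row on tabs, giving one vector of fields per record."
  [rows]
  (map #(str/split (first %) #"\t") rows))

(split-rows parsed-rows)
;; => (["1001" "03/18/2015" "2"] ["1002" "03/19/2015" "5"])
```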

I was pretty excited to clean up a lot of more complicated code through the use of the ->> macro; its functionality is semi-fascinating to me and this was a perfect opportunity to try it out for real.
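As a quick illustration of what ->> buys you, the two functions below compute the same thing; the thread-last version inserts each result as the last argument of the next form, so it reads top to bottom instead of inside out. The function names are made up for this sketch.

```clojure
;; Nested form: has to be read from the innermost call outward.
(defn sum-even-squares-nested [nums]
  (reduce + (map #(* % %) (filter even? nums))))

;; Thread-last form: each step receives the previous result as its last argument.
(defn sum-even-squares [nums]
  (->> nums
       (filter even?)
       (map #(* % %))
       (reduce +)))

(sum-even-squares [1 2 3 4])
;; => 20
```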

I ended up getting bitten by my validate-record function though. The zipmap call wasn't mapping the keys and values correctly due to some unforeseen impacts tied to whitespace (turns out it breaks on spaces). I'm still kind of caught up here, as the continual calls to first make me think there's a better way to handle this. However, all things considered, the email functionality as well as the validation function and threading macro worked more than adequately for the tasks at hand.
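One possible way to sidestep both issues, sketched under the assumption that each record is available as a single tab-separated line: split it once, trim every field, and zipmap the result without any calls to first. The key list here is a shortened, hypothetical subset and record->map is an illustrative name.

```clojure
(require '[clojure.string :as str])

;; Hypothetical subset of the field keys, for illustration only.
(def field-keys [:doc-num :status-date :origin])

(defn record->map
  "Splits a tab-separated line, trims stray whitespace from each field, and pairs fields with keys."
  [line]
  (zipmap field-keys
          (map str/trim (str/split line #"\t"))))

(record->map "1001\t 03/18/2015 \tORD")
;; => {:doc-num "1001", :status-date "03/18/2015", :origin "ORD"}

;; A related pitfall: zipmap treats a string as a sequence of characters.
(zipmap [:a :b] "xy")
;; => {:a \x, :b \y}
```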

The only other thing I floundered a bit with was environment-specific configuration files and resources, although I think that may turn into a separate post.

Overall...

Overall, I'd say it was a pretty successful rewrite, even if it never makes it past the experiment stage. It was fun to finally sink my teeth into some true Clojure development. IntelliJ plus the Cursive plugin made development move pretty quickly, and I'd definitely use the toolkit again. Now that I've got a bit of a taste for Clojure though, I'm definitely going to add it to my tool belt and continue using it whenever the opportunity presents itself.