Working with Rich Text's MIME Structure
Jesse Gallagher    

My work lately has involved, among other things, processing and creating MIME entities in the format used by Notes for storage as rich text. This structure isn't particularly complicated, but there are some interesting aspects to it that are worth explaining for posterity. Which is to say, myself when I need to do this again.

As a quick primer, MIME is a format originally designed for email which has proven generally useful, including for HTTP and, for our needs, internal storage in NSF. Like many things in programming, it is organized as a tree, with each node consisting of a set of headers (generally, things like "Content-Type: text/html"), content, and children.

Domino stores the text part of rich text in MIME as HTML. In the simplest case, this ends up a one-element "tree", which you can see in the document's properties dialog:

    Content-Type: text/html; charset="US-ASCII"

    Hello there

There's slightly more to its full storage implementation (like the MIME_Version item), but the MIME Part items are the important bits. This simple structure can be abstracted to this tree:

    text/html

Things get a little more complicated when you add embedded images and/or attachments. When you do either of those, the MIME grows to multiple items and becomes a multi-node tree.

Embedded Images

When you add an embedded image in the rich text field, the storage grows to four same-named MIME Part items. Concatenated (and clipped for brevity), the items then look like:

    Content-Type: multipart/related; boundary="=_related 006CEB9D85257E7C_="

    This is a multipart message in MIME format.
    --=_related 006CEB9D85257E7C_=
    Content-Type: text/html; charset="US-ASCII"

    Here's a picture:

    Done.
    --=_related 006CEB9D85257E7C_=
    Content-Type: image/jpeg
    Content-ID: <_2_0C1832A80C182E18006CEB9885257E7C>
    Content-Transfer-Encoding: base64

    *snip*
    --=_related 006CEB9D85257E7C_=--

You can see the same sort of HTML block as before contained in there, but it sprouted a lot of other stuff. To begin with, the starting part turned into "multipart/related". The "multipart" denotes that the top MIME entity has children, and the "related" is used when the children consist of an HTML body and inline images. There are delimiters used to separate each part, using the auto-generated convention of "related" plus an effectively-random number. The image itself is represented as a MIME Part of its own, in this case stored inline and Base64-encoded (it can be shifted off to an attachment by Notes/Domino after a certain size). This structure can be abstracted to:

    multipart/related
        text/html
        image/jpeg

The HTML is designed so that there is an image tag that references the attached image using a "cid" URL, an email convention that basically means "find the entity in this related MIME structure with the following content ID" - you can then see the content ID reflected in the JPEG MIME Part. This sort of URL doesn't fly on the web, so anything displaying this field on a web page (or otherwise converting it to a non-MIME storage format) needs to translate that reference to something appropriate for its needs.*

Attachments

When you have a rich text field with an attachment (in this case without the embedded image), you get a very similar structure:

    Content-Type: multipart/mixed; boundary="=_mixed 006EBF7C85257E7C_="

    This is a multipart message in MIME format.
    --=_mixed 006EBF7C85257E7C_=
    Content-Type: text/html; charset="US-ASCII"

    Here's an attachment:

    Done.
    --=_mixed 006EBF7C85257E7C_=
    Content-Type: application/octet-stream; name="cert.cer"
    Content-Disposition: attachment; filename="cert.cer"
    Content-Transfer-Encoding: binary

    cert.cer
    --=_mixed 006EBF7C85257E7C_=--

The structure is the same sort of tree as previously, but the "related" content sub-type has changed to "mixed". This indicates that there are multiple types of content, but they're conceptually distinct. In any event, the tree looks like:

    multipart/mixed
        text/html
        application/octet-stream

"application/octet-stream" is a generic MIME type for, basically, "bag of bytes" - MIME-based tools use it when they either don't know the content type or, as in this case, don't care. In this case, Notes/Domino splits out the content to be an NSF-style attachment and then references that in the MIME - this is an implementation detail, though, as the API returns the value regardless.

This also highlights a minor limitation in rich text storage: attachments do not have an inline representation in the HTML, and so they are always moved to the end of the field in Notes. At first, I was peeved by this limitation, but it makes a sort of sense: cid references are really about images, and I guess Lotus didn't want to override that for use in normal link elements.

That brings us to the final potential structure you're likely to run across:

Embedded Images And Attachments

When you include both embedded images and attachments, things get slightly more complicated. I'll skip the raw MIME and go straight to the tree:

    multipart/mixed
        multipart/related
            text/html
            image/jpeg
        application/octet-stream

So this becomes a combination of the two formats, and a bit of logic emerges. In Notes's structure, "multipart/mixed" always contains two or more children, and the first one is the textual body, whatever form that may take. One of those forms is just a single-part "text/html", and the other is a "multipart/related" subtree containing the "text/html" and one or more images.
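To make that logic concrete, here is a minimal sketch in Python using the standard library's email package rather than the Notes/Domino API. The sample message, its "outer" and "inner" boundaries, the "img1" content ID, the image tag, and the helper names (outline, split_rich_text, image_for_cid) are all made up for illustration; the only thing taken from the structures described here is the shape of the tree.

```python
from email import message_from_string

# Hypothetical message shaped like the combined tree; not real Notes output.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/related; boundary="inner"

--inner
Content-Type: text/html; charset="US-ASCII"

<p>Here's a picture: <img src="cid:img1"> Done.</p>
--inner
Content-Type: image/jpeg
Content-ID: <img1>
Content-Transfer-Encoding: base64

AAAA
--inner--
--outer
Content-Type: application/octet-stream; name="cert.cer"
Content-Disposition: attachment; filename="cert.cer"

dummy
--outer--
"""


def outline(part, depth=0):
    """Return the content types of a MIME tree as indented lines."""
    lines = ["    " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines


def split_rich_text(root):
    """Return (html_part, inline_images, attachments) for a Notes-style tree.

    Assumes the conventions described above: the first child of
    multipart/mixed is the body, and the first child of
    multipart/related is the HTML.
    """
    body, attachments = root, []
    if root.get_content_type() == "multipart/mixed":
        children = root.get_payload()
        body, attachments = children[0], children[1:]
    images = []
    if body.get_content_type() == "multipart/related":
        children = body.get_payload()
        body, images = children[0], children[1:]
    return body, images, attachments


def image_for_cid(images, cid):
    """Find the inline part whose Content-ID matches a cid: reference."""
    for part in images:
        if part.get("Content-ID", "").strip("<>") == cid:
            return part
    return None


root = message_from_string(SAMPLE)
print("\n".join(outline(root)))
```

The same split_rich_text call also handles the two simpler shapes: a lone "text/html" part comes back with empty image and attachment lists, and a bare "multipart/related" comes back with no attachments.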
Once you get a feel for these structures, it makes the task of reading and creating Notes-alike MIME items much less daunting. There are a number of other concerns I've been dealing with as well (such as the conversion of composite-data rich text to HTML and how there are two ways to do it), and maybe I'll make a followup post at some point about those.

* As a minor note on this point, it's an area where the Notes client and XPages diverge slightly. The Notes client (which generated the example above) leaves inline images "nameless" - they contain no "Content-Disposition" header and no name in the "Content-Type", instead sticking with just the "Content-ID" for identification. With XPages, however, presumably due to the fact that it has filename information during the upload process, the result still contains (and is referenced by) the "Content-ID" value, but it also contains a line like:

    Content-Disposition: inline; filename="foo.jpg"

This functions the same way for most purposes, but it may be significant. For example, if you happen to write processing code that uses the presence or absence of the "Content-Disposition" header as an indicator of whether a part is an attachment, knowing this ahead of time could save you a certain amount of headache. The right way to do it is to treat a part as inline if the header is either missing or has a basic value of "inline", and as an attachment only if the value is "attachment".
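As a rough illustration of that check, here is a small Python sketch built on the standard library's email package, not on the Notes/Domino API. The helper names is_attachment and make_part are hypothetical, and the three parts are constructed by hand to mimic the Notes client, XPages, and attachment cases described above.

```python
from email.message import Message


def is_attachment(part):
    """True only when the basic Content-Disposition value is "attachment".

    get_content_disposition() returns the lowercased value without its
    parameters, or None when the header is missing, so both a missing
    header and "inline" count as not-an-attachment.
    """
    return part.get_content_disposition() == "attachment"


def make_part(content_type, disposition=None):
    """Build a bare part with just the headers under discussion."""
    part = Message()
    part["Content-Type"] = content_type
    if disposition is not None:
        part["Content-Disposition"] = disposition
    return part


# Notes-client style inline image: no Content-Disposition at all.
notes_image = make_part("image/jpeg")
# XPages style inline image: disposition present, but "inline".
xpages_image = make_part("image/jpeg", 'inline; filename="foo.jpg"')
# A real attachment.
cert = make_part("application/octet-stream", 'attachment; filename="cert.cer"')

for label, part in [("notes", notes_image), ("xpages", xpages_image), ("cert", cert)]:
    print(label, is_attachment(part))
```

Checking only whether the header exists would misclassify the XPages-style image as an attachment, which is the headache this check avoids.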

---------------------
http://frostillic.us/f.nsf/posts/277E44C94FC9C5D485257E7C0080EF7D
Jul 08, 2015






