Friday, October 1, 2010

Interesting color problems...

It's been a while since I posted here - mostly because not much is happening right now business-wise in the industry (I think people are biding their time waiting to see what happens in the economy).

At any rate here's an interesting color story....

A few months back I get a call from a customer - it's the operations folks down on the shop floor.  They tell me that the RIP is failing to properly process the files because there are "ghosting" and "shadows" around the fine strokes on small black text.  They say this is an on-going problem, it's happened before, and could I help them fix the RIP...

Only one problem with this - we didn't sell them any RIPs...

So I get on the phone and talk to the operator - always a challenge for a variety of politically incorrect reasons.  Eventually I tease out of her the specifics - fine black lines in images have halos of C, M and Y around them - basically as reported.  So I ask - is this on all the work or only on some of the work?  Only on one type of job, on one kind of logo, is the reply.

So I stop and think for a minute: A) they are not our RIPs, B) we have a process that creates image content, C) there is a workflow to prevent CMYK black.  This particular system has been running live for at least a decade in one form or another so I am not about to dive in and start changing any settings without some very careful thought.

The RIPs in this configuration pass CMYK as-is, i.e., don't remove CMYK black, because sometimes there is a need for it.  So turning it off in the RIP will invite untold production disaster fifty midnights from now.  Plus it's been set this way for a decade and so far this is the only problem we have been called on.  My buddy checks the RIP logs and we see no signs of tom-foolery with settings, etc.

Next I get hold of a logo that's causing the problems in source form, i.e., a PDF.  Sure enough there's CMYK-black tiny text (six point).  So now we've tracked down why the press is doing what it does.

So I call the vendor of the press and explain to the tech how it's not really their issue (save for any alignment issues they may have on their own).  He doesn't know enough about us to understand we didn't sell the RIP so he's been thrashing for a day or two trying to fix something that's not really broken.

Since the actual logo creation process is out of our hands I talk to the big boss and explain what's wrong.

Fast forward to today...

I get an email from the press vendor explaining a service call for the exact same issue has come up.

This time we can save an airfare and a few days on site diagnosing nothing because the new guy from the press vendor is on the ball. 

Monday, September 13, 2010

The wonder of transpromo...

We are talking about this over on the Lone Wolf today.

AFP plays (and has played) a role here.  I think that the "printing industry" is, or, rather, is trying to co-opt this.

For me AFP content creators are in a much better position to "own" this industry....

Wednesday, September 8, 2010

PPDs, Clouds and AFP...

Over on the Lone Wolf I have been talking about the Google "Cloud Printing" nonsense.  Part of this discussion was a mention of PPDs.  In my past PDF-based life, particularly in the age of Windows, PPDs caused no end of problems related to generating incorrect output.

I am not going to go into the details of PPDs here or the specific problems they caused.  Instead I am going to ponder briefly about their function in AFP - if any.  On Windows a PPD told the printer driver what the paper size choices might be, what the print device capabilities were, and so on.

At any rate, as for AFP and Cloud Printing.  It would seem that cloud printing would be the antithesis of AFP - particularly from the aspect of ensuring that output was printed correctly. 

So let's take a look at the Google interface for cloud printing:

Well, this is informative.

I think it's crucial for AFP users like banks and financial services companies to see how well organized and thought out this print process is - particularly in the context of "remote printing" (remote meaning I print from the cell phone or PDA).

I think the most glaring error in all this is the "Printer Errors" path.  Now, let's suppose I am out at some offices, say for example Google's offices, and I have my handy PDA on which is my important contract.  Now, in the cloud world I suppose that I could just wave my PDA around and find a locally willing printer for my job - at least one that isn't too particular about who prints to it.

So, with the printer in hand, so to speak, I print my contract.  But, as is often the case, a paper jam occurs.  Now what?

Well, I imagine that on my PDA will be the small unhappy face (the opposite of the "happy" face linked to the ERROR box).  So my important contract is now lying in some output bin - half printed - for the cleaning people to find.

I guess now I have to grab my GPS (or use my PDA GPS) to find the printer - wherever that might be.  Let's just hope I didn't accidentally print the job in the next state.

Enough sarcasm.

I surely hope Google does not try and enter the commercial printing arena.

PPDs do a bad enough job handling things as simple as output bins and stapling.  One can just imagine the agony of trying to navigate a print-style dialog on a PDA or cell phone.

Now imagine AFP bundled into the mix...

I think it's safe to assume that AFP and "Cloud Printing" will probably not mix - except perhaps in the heads of marketers - for a long time.

Wednesday, September 1, 2010

AFP and the web...

We are currently working on getting our web servers and browsers to support AFP content, i.e., you can go onto a web site and if you have the right plug-in loaded in your browser you can directly view AFP that's linked.

So we need to get MIME type of 'application/afp' working I guess - we use IIS so there is no doubt some unpleasant nonsense involved with that.

We have been working on some demo web stuff and we have the various browser plug-ins working so now we need to get them working together.

My hope is to announce this all soon in some public way but I would like to have this working beforehand.

Tuesday, August 31, 2010

More formal specs...

We have been working in the background here (despite various distractions) on the detailed specifications for the AFP product APX Raster Pro:

APX Raster Pro Specifications

APX Raster Pro is available on Apple OSX/Linux/Windows as a command line server application.  It supports simultaneous conversion of files between standard formats such as PDF, PostScript, JPEG, TIFF, and PSEG IOCA.  It also supports conversion of other file types via any third party raster application that can produce TIFF files as output (Apago Piktor, GhostScript, ImageMagick, etc.).
  • Support is standardized for AFP at 240, 300, and 600 dpi - though any resolution can be supported.
  • Support for other file formats is available at any resolution.
  • Output images may be automatically rotated 0, 90, 180, and 270 degrees (clockwise or counter-clockwise) during conversion and output files may support multiple rotations within the same file.
  • Image transparency is supported through IOCA image transparency masks and can also be calculated through color-based specifications.
  • All output files can be encoded for color as CMYK or RGB.  Other file dependent color spaces are also supported depending on the format used.
  • Image resizing is supported, with multiple algorithms for resizing: linear, Gaussian, Hamming, and Blackman.
  • All output files can be directed to unique output folders based on file type.

Color transformations are supported as follows:
  • APX Raster Pro supports a set of parameterized built-in color transforms (called parametric transforms):
             - RGB, CMYK and GRAY to/from RGB, CMYK,  and
             - GRAY, IOCA YCrCb and YCbCr to CMYK and RGB.
  • Shading functions that support manipulation of intensity of CMYK and RGB grey.
  • Shade-based conversions to manipulate the intensity of all color shades.
  • Color specification as percentage, decimal or fractions.
  • Color processing may be applied to images (icmyk), strokes (cmyk), fills (CMYK) in any combination.
  • Color-table-based conversions, i.e., icmyk*(21#,93#,255#,0#)=>cmyk(1,0,0,0).
  • Parametric transforms can be applied over specific sets of colors and named as Color Procs.
  • Color procs can be prioritized in a specific order for application to images.  Color procs can also be triggered based on file extension or type.
  • When multiple color procs are stacked support is provided to ensure that any given input color value is affected by exactly one color proc.
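
Here is a minimal sketch of how stacked color procs might guarantee that each input color is touched by exactly one proc.  The `ColorProc` class, the split into a matching rule and a transform, and the "KeepBlack" proc are assumptions made up for illustration; this is not the actual APX Raster Pro interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

CMYK = Tuple[float, float, float, float]  # each component in 0.0..1.0


@dataclass
class ColorProc:
    """Hypothetical color proc: a named transform limited to a set of colors."""
    name: str
    matches: Callable[[CMYK], bool]    # which input colors this proc claims
    transform: Callable[[CMYK], CMYK]  # what it does to a claimed color


def apply_procs(color: CMYK, procs: List[ColorProc]) -> Tuple[CMYK, str]:
    """Apply the first matching proc in priority order, and only that one."""
    for proc in procs:
        if proc.matches(color):
            return proc.transform(color), proc.name
    return color, "(none)"  # unclaimed colors pass through unchanged


procs = [
    # Highest priority first: pure K darker than 50% is lightened by 10%.
    ColorProc("LightGray1",
              lambda c: c[:3] == (0, 0, 0) and c[3] > 0.5,
              lambda c: (0.0, 0.0, 0.0, round(c[3] * 0.9, 4))),
    # Lower priority: any other pure K is claimed but left unchanged.
    ColorProc("KeepBlack",
              lambda c: c[:3] == (0, 0, 0),
              lambda c: c),
]

result, used = apply_procs((0, 0, 0, 0.8), procs)
```

Because the loop returns on the first match, an 80% black is handled by "LightGray1" and never reaches "KeepBlack", even though both procs would accept it.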

Monday, August 30, 2010

Wednesday, August 25, 2010

Tweaking color...

We have a system that converts PDF to AFP.  Since we can use any commercial tool to convert the PDF to a raster, some of them produce different results for PDF gray.

In general we require that customers always create CMYK images for projects destined to AFP because some AFP devices only support required image features, such as transparency, in CMYK.  Now customers don't always do what they are told so sometimes people do things like create a PDF Gray in a file instead of a CMYK Gray.

So what's the difference?  If you create a document in CMYK then you can generally rely on the creation tools to put the colors you define as CMYK colors into the file in a predictable way.  This means that if I say I want CMYK(23%, 10%, 5%, 6%) that's what I will get.  So if I say I want CMYK(0,0,0,50%) then I should get 50% black and no other colors.

Gray, on the other hand, is different.  While you might imagine that Gray(50%) is the same as CMYK(0,0,0,50%) it may not be - depending on the tools you are using.  Further, the AFP IOCA model does not have direct support for gray-only or black and white images (it only supports RGB, CMYK, YCrCb and YCbCr). 

Without getting into very complex color issues we can say that there are various methods for producing gray on a CMYK device.  One method is to use only CMY in approximately equal amounts to simulate gray.  Another method is to use CMYK and to vary the proportion of K inversely to C, M, and Y in order to render part of the image with K and part with CMY.  A third option is to use only K as if it were Gray and ignore CMY.
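
The three methods can be sketched as simple arithmetic.  This is a naive illustration only: it assumes ink coverage values from 0.0 to 1.0, treats equal C, M and Y as neutral, and ignores ICC profiling entirely.  The function names and the `k_share` parameter are made up for the example.

```python
def gray_as_cmy(g):
    """Method 1: gray simulated with equal C, M and Y, no black ink."""
    return (g, g, g, 0.0)


def gray_as_mix(g, k_share):
    """Method 2: split the gray between K and CMY; more K means less CMY."""
    k = g * k_share
    cmy = g - k
    return (cmy, cmy, cmy, k)


def gray_as_k(g):
    """Method 3: K only, C, M and Y ignored."""
    return (0.0, 0.0, 0.0, g)


half_gray = [gray_as_cmy(0.5), gray_as_mix(0.5, 0.5), gray_as_k(0.5)]
```

All three tuples describe "the same" 50% gray, but only the last one puts a single ink on the paper, which is why the choice matters so much for small type.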

Additionally device ICC profiling further affects this by doing its own conversions between gray and various gray representations.

So rendering gray as an AFP color is driven, in this situation, by a number of factors:

  1. Device registration.
  2. Choice of representation for gray during AFP conversion.
  3. Device representation of an AFP gray representation.

So for my current application #1 is likely to take priority because when printing very small type errors in device registration between the colors create low-quality output.

So what we do is create a color transformation for APX Raster Pro that tells it to convert shades of CMY gray to K black, which eliminates the registration issues.
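
A rough sketch of what such a transform does, under stated assumptions: the tolerance used to decide what counts as a "gray", and the choice to add the averaged CMY to any existing K, are mine for illustration and not the product's actual rules.

```python
def cmy_gray_to_k(c, m, y, k, tolerance=0.02):
    """Map a near-neutral CMY(K) color to K only; leave other colors alone."""
    if max(c, m, y) - min(c, m, y) > tolerance:
        return (c, m, y, k)            # a real color, not a gray: untouched
    gray = (c + m + y) / 3.0           # average the nearly equal components
    return (0.0, 0.0, 0.0, min(1.0, gray + k))


small_text = cmy_gray_to_k(0.5, 0.5, 0.5, 0.0)   # CMY gray becomes 50% K
logo_red = cmy_gray_to_k(0.0, 0.9, 0.8, 0.0)     # a color passes through
```

With only one ink left in the small type there is nothing for the device to misregister.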

Tuesday, August 24, 2010

I guess I am on the right track...

I received an email for a product called AFP Tuner from MakeAFP PTE LTD.

I don't know where they are from - somewhere in Asia I believe - but their product includes one of the elements I am working on: "Replaces legacy AFP page segments with JPEG/TIFF/GIF color images."

I suppose this is good news in a couple of ways.

First off it validates the idea of basic AFP tuning (there are corresponding PDF tuning products as well).  Somewhere, half way around the world, someone has the same idea.  The bulk of this product focuses on other aspects of AFP like font substitution and there doesn't seem to be a parallelization aspect - but I am pleased overall to see other like-minded companies out there.

Secondly, they don't seem real interested in what you can do with color.  "Legacy AFP page segments" means ugly 1-bit images so fixing them is not hard or interesting.  Making them look good, on the other hand, is a different matter.

One interesting idea they present is this: "Replaces legacy shading patterns or shading images with vector color or gray background graphics..."  Basically this involves identifying "gray boxes" of various sorts that have been constructed with images and replacing them with graphic commands to draw and fill a box.  Others have asked me about this feature and I am very close to being able to provide it.

This offering is also a "command line" style application based on the information they provide.

I think that in the long run there will need to be a UI plus an underlying command line application to make this type of tool easier to deal with for end-users.

AFP vs PPML

I have been experimenting with AFP external resources (see previous posts).

It seems clear that this is a very powerful model for handling what PPML calls "reusable objects" - basically anything you want to reuse is placed in a resource package at the front of the job.  Each resource is given a unique name.  Inside the job when you wish to access the resource on a given page you map it into the page's environment and then place the object.

AFP only offers rotation, simple positioning, some limited scaling, clipping and other functions as opposed to PPML's full CTM model.  But it's an effective model and fairly easy to use.

So at least from a "functional equivalence" perspective I can address some print capabilities requiring this type of function.

Monday, August 23, 2010

After some experimentation...

I have been fooling around with various means of doing external image resources.  Basically I think that, for the most part, the viewers do not support it very well.  Pasting the same IOCA image data into the page directly works just fine in most AFP viewers.

I am still waiting on some technical support for a couple of the viewers to see if that clears things up.

I also started working on how well AFP works when stitching together parts of an image.

For this I broke a larger image into parts and set up some AFP pages to stitch them.  To do this I set the resolution of the page and image to be the same, in this case 600 dpi, and then placed the images next to each other.  This seems to work well in all the various viewers.

It is my as yet unproven belief that AFP, unlike PDF, allows device-specific pixel alignment and placement.  PDF is, as you may or may not know, totally device independent in that regard and, further, you are not allowed to "know" in the PDF code about the device resolution.  PDF images have a specific resolution at which they are defined but they are always transformed via a CTM prior to display.  Since CTMs are based on floating point numbers there is no guarantee that things will line up evenly on the display device.

AFP seems very focused on specific image and device resolutions.  So far I have no reason to believe that it does not allow stitching of images and so forth at the device pixel level.  There is no CTM-style scaling for IOCA images so I don't expect much problem.
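
To make the alignment point concrete, here is a small sketch that computes where tile edges land in device pixels.  It only illustrates the arithmetic; it says nothing about how any particular viewer or printer actually rounds, and the function names are made up for the example.

```python
from fractions import Fraction


def tile_edges(tile_px, image_dpi, device_dpi, count):
    """Device-pixel positions of the edges of `count` tiles placed side by side."""
    scale = Fraction(device_dpi, image_dpi)   # exact, no floating point error
    return [i * tile_px * scale for i in range(count + 1)]


def all_on_pixel_grid(edges):
    """True when every tile edge falls exactly on a device pixel boundary."""
    return all(edge.denominator == 1 for edge in edges)


same = tile_edges(101, 600, 600, 3)     # image and device both at 600 dpi
scaled = tile_edges(101, 240, 600, 3)   # 240 dpi image scaled up to 600 dpi
```

With matching resolutions every edge is a whole pixel, so neighboring tiles butt up exactly.  Once scaling is involved some edges land on half pixels (here at 252.5) and something has to round, which is where seams and overlaps come from.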

A note on the viewers.

There are two main types - browser plug-ins and stand-alone applications.  Both IBM and ISIS offer browser-based viewers.  I have been using both for the last several days.  IBM also offers a stand-alone application.

The ISIS viewer offers the most impressive display capabilities so far.  I really like its anti-aliasing capabilities for display and its fast, smooth scrolling and scaling.

The IBM viewer is very tolerant of errant AFP and supports external resources as you would expect.  IBM set up a joint venture called InfoPrint with its printer division and Ricoh in 2007.  So far all of the AFP support still appears to be on the IBM site.

I guess the real question is how all of this will print.  So far I have not worried too much about that but that will no doubt be a source of misery very soon.

Friday, August 20, 2010

AFP Viewers...

Now that I am creating full AFP files I have started experimenting with various "free" or "demo" AFP viewers. Some simple googling will turn up a number of them:

IBM

ISIS

Compulsive Coder

CreDo

More here... 

Of course, there are many AFP commercial products available from some of these same companies as well as from companies like GMC, Elixir, Barr, and many, many more.

My experience so far playing around with some of these tools has been, to say the least, mixed.

For me there are two perspectives: a novice AFP user and an experienced software developer.

As a novice AFP user I can only equate my experience to my novice PDF user experiences from around 1998 and 1999.  At that time Adobe had just released Acrobat 3.0 and most of what I did with Acrobat started with that version.

As a software developer I am familiar with studying manuals and documents to determine how to create software and output, in this case AFP, that conforms.  I am familiar with building tools to check my own work and to validate it.  I am also familiar with discovering what is "missing" from the manuals and standards by experimentation.

So what's important as a novice?  Well, for me I'd like a tool that was simple and reliable and did what it was documented to do.  For the most part all the tools I have been looking at, at least on the surface, do this.  I like a tool that fits naturally with the AFP environment, i.e., something that's not a chore to deal with.

So, from this perspective, the first area of excitement I found was the notion of external AFP resources.  The idea, at least from what I can see, in AFP is that AFP print jobs can be split into two components: a set of resources and a job that uses them.  The resources (things like Page Segments, Images, and so forth) all have names that can be referenced in the job.   The idea is that you can separately transmit resources to a printer and then multiple jobs that reference them in order to save rasterizing and RIP time.

The resources can also be prepended to the job file so that they are "part of the job" in that a single transmission of resources and job together provide a complete definition.

The AFP manuals provide a complete and detailed description of what is supposed to work and how in this regard.  Each of the tools I have played with has a mechanism to support this.  Basically they all allow you to specify a directory where "external resources" can reside.  Jobs you present to the software do not need to have all the assets embedded and, when a reference to an asset is found that's not directly in the job stream, the directory is searched for the missing asset.
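
The lookup these tools perform can be sketched roughly as follows.  The order (job stream first, then the directory) follows the description above, but the idea that a resource sits in a file named exactly after the resource is an assumption for illustration; real tools have their own naming and extension rules.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional


def find_resource(name: str, inline: Dict[str, bytes],
                  resource_dir: Path) -> Optional[bytes]:
    """Return resource bytes: from the job stream first, then the directory."""
    key = name.strip()                 # AFP names are blank padded to 8 chars
    if key in inline:
        return inline[key]
    candidate = resource_dir / key     # assumed layout: one file per resource
    if candidate.is_file():
        return candidate.read_bytes()
    return None                        # caller decides how to report this


with tempfile.TemporaryDirectory() as d:
    (Path(d) / "821321").write_bytes(b"external")
    from_dir = find_resource("821321  ", {}, Path(d))
    from_job = find_resource("821321  ", {"821321": b"inline"}, Path(d))
    missing = find_resource("NOSUCH", {}, Path(d))
```

Returning `None` for a missing resource, rather than crashing or silently drawing nothing, at least lets the caller say which name could not be found.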

(Note to anyone following this blog - if your software is mentioned directly here - or you would be interested in me using your software and writing about it - please let me know.  I will promise that if I write about it here I will always give you a chance to respond to my findings before I write about them here - considering I may be doing something "wrong" as a novice.)

So far I have had a variety of different experiences with the external resource function.  Some software works as advertised and others display a variety of interesting issues - most notably either crashing or displaying nothing.

For example, one would imagine that attaching the resources to the job versus referencing the resources from a directory wouldn't make a difference.

Wednesday, August 18, 2010

A real AFP page...

So after much work and toil my AFP application is now working as a true AFP creation application.

Most of the previous work has been related to creation of Page Segments (PSEGs) which are images or graphic objects.  Work has been centered around creating images in this format, performing color transformations, and so on.

The creation of actual AFP pages is really not a large step beyond this but in terms of remade AFP it is nice to finally see months of work validated.  The remake system currently supports altering PSEGs and other AFP constructs in existing AFP files.

The thing I am currently interested in is creating AFP pages that use PSEG images - both as a way to work with new AFP pages and as a platform for my new PDF/AFP compression model.

So my first task was to figure out what the "simplest" valid AFP file might be with a single image.  It turns out not to be that complex.  Basically you can think of the file like this:

     (BEGIN_RESOURCE_GROUP rt='BRG' x='D3A8C6' name='LXG00000')
       (INSERT id='821321_XXXX' /)
       (END_RESOURCE_GROUP rt='ERG' x='D3A9C6' name='LXG00000' /)
     (/BEGIN_RESOURCE_GROUP)

     (BEGIN_DOCUMENT rt='BDT' x='D3A8A8')
       (BEGIN_PAGE rt='BPG' x='D3A8AF')
         (BEGIN_ACTIVE_ENVIRONMENT_GROUP rt='BAG' x='D3A8C9')
           (MAP_PAGE_SEGMENT rt='MPS' name001='821321  ' x='D3B15F' /)
           (PAGE_DESCRIPTOR rt='PGD' XpgBase='00' YpgBase='00' XpgUnits='14400' YpgUnits='14400' XpgSize='11880' YpgSize='15840' x='D3A6AF' /)
           (END_ACTIVE_ENVIRONMENT_GROUP rt='EAG' x='D3A9C9' /)
         (/BEGIN_ACTIVE_ENVIRONMENT_GROUP)
         (INCLUDE_PAGE_SEGMENT rt='IPS' x='D3AF5F' name='821321  ' XpsOset='400' YpsOset='700' /)
         (INCLUDE_PAGE_SEGMENT rt='IPS' x='D3AF5F' name='821321  ' XpsOset='800' YpsOset='1500' /)
         (END_PAGE rt='EPG' x='D3A9AF' /)
       (/BEGIN_PAGE)
       (END_DOCUMENT rt='EDT' x='D3A9A8' /)
     (/BEGIN_DOCUMENT)


The file consists of two parts (using simple non-XML XML so blogger won't choke): a resource which is the image to display and a document with a single page that displays that image.  The image is actually inserted from another AFP file (which we will talk about in another post) - but basically it's the wholesale insertion of an AFP PSEG.  This resource could already be stored on the printer as a reusable object but we include it to make the AFP file completely self defining.  The PSEG has a name associated with it which is used by subsequent AFP to identify it.

The AFP document is quite straightforward and consists of a BEGIN/END document pair of AFP records.  Inside this pair is a BEGIN/END page to define the actual page.

In AFP an environment group (here enclosed by a BEGIN/END environment pair) appears immediately after the BEGIN page definition to describe the layout (height, width, resolution, included resources, and so on) of the page.  In this case two AFP records do this: a page descriptor (PGD) and a map page segment (MPS).

The page content is simply two include page segment records (IPS) that reference the image by name and indicate a position on the page to place it.
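
For anyone curious what one of these records looks like as bytes, here is a small sketch of a structured field writer.  It assumes the usual layout for AFP files: a 0x5A carriage-control byte, then an 8-byte introducer (2-byte length, 3-byte identifier, a flag byte, 2 reserved bytes), then the data.  The helper names, the document name "DOC00001", and the choice of code page 500 for EBCDIC are assumptions for illustration, and the result is nowhere near a complete AFP file.

```python
import struct


def structured_field(sf_id_hex: str, data: bytes = b"") -> bytes:
    """Build one AFP structured field, prefixed with the 0x5A control byte."""
    sf_id = bytes.fromhex(sf_id_hex)       # 3-byte identifier, e.g. D3A8A8
    length = 8 + len(data)                 # introducer plus data, not the 0x5A
    introducer = struct.pack(">H", length) + sf_id + b"\x00" + b"\x00\x00"
    return b"\x5a" + introducer + data


def afp_name(name: str) -> bytes:
    """8-character name, blank padded, encoded as EBCDIC (cp500 assumed)."""
    return name.ljust(8)[:8].encode("cp500")


begin_doc = structured_field("D3A8A8", afp_name("DOC00001"))   # BDT
end_doc = structured_field("D3A9A8", afp_name("DOC00001"))     # EDT
```

The hex identifiers are the same `x=` values shown in the listing above, so the begin and end records differ only in their middle identifier byte (A8 versus A9).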

Monday, August 16, 2010

Data Driven Color Debugging... (part 3)

This falls along the same lines as dealing with any other data driven aspect of industrial printing.

First, you have to set the process up correctly.  This involves several steps:
  1. Identifying what needs to be changed and why.
  2. Determining what color changes to apply.
  3. Testing the color changes relative to color approval.
  4. Testing the data aspects that determine the color changes.
The key difference here between testing data driven color and, say, data driven content is that three elements are involved in the color.  First, you have to pick the right color to change and make sure the color change process recognizes the appropriate shades, etc.  Second, you have to make sure what is produced passes all approvals.  Third, you have to make sure that the data you need to recognize when to apply the color change is present.

In general not too different than any other color work save for step #3.

Our general model for #3 has been to give transforms textual names, e.g., "LightGray1", and to match job identifiers with a table of transforms.  This allows a human to quickly determine what transforms go with what work.  The second element is to attach metadata to the job in order to trigger the proper transform group.  We do this with TLEs (or PDF Bookmarks).  Each TLE or bookmark describes a span of pages and links that span of pages to a set of transforms.
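
A sketch of the idea, with made-up names: each TLE or bookmark is reduced to a page span plus the name of a transform group, and a table maps group names to named transforms.  The `Span` type, the group names, and the "CmyGrayToK" transform are hypothetical; only "LightGray1" comes from the text above.

```python
from typing import Dict, List, NamedTuple


class Span(NamedTuple):
    first_page: int
    last_page: int
    group: str      # transform group named in the TLE or bookmark


# Human-readable table: which named transforms belong to which group.
TRANSFORM_GROUPS: Dict[str, List[str]] = {
    "StatementLogos": ["LightGray1", "CmyGrayToK"],
    "Inserts": [],
}


def transforms_for_page(page: int, spans: List[Span]) -> List[str]:
    """Return the transform names that apply to one page of the job."""
    for span in spans:
        if span.first_page <= page <= span.last_page:
            return TRANSFORM_GROUPS.get(span.group, [])
    return []       # page not covered by any TLE or bookmark


spans = [Span(1, 4, "StatementLogos"), Span(5, 6, "Inserts")]
log_lines = [
    f"page {p}: {', '.join(transforms_for_page(p, spans)) or 'no transforms'}"
    for p in (1, 5, 9)
]
```

Log lines like these are what let someone on the support side check at a glance whether the right transforms fired for a given span of pages.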

Having this data embedded in the document makes debugging and tracking problems straightforward.

On the application side pdfExpress XM and APX Raster Pro report what transforms are applied by page range and bookmark as the transformations occur.  This allows support personnel to quickly determine if the proper transforms are being applied.

Most imaging processes we are involved in already support some form of metadata per page for other reasons, e.g., mailing or mag strip encoding, so adding additional information for color support is not an issue.

On the debugging side, given this information, it's not too hard to see if the proper transforms trigger simply by inspecting the log.  However, that in and of itself may not be adequate.  Sometimes the output will be wrong.

Typically "wrong output" will come from one of a few sources: 
  • Changes to input that were not tested.
  • Lack of full spectrum testing on job setup.
  • Programming errors.
Input changes, or content creep as we described previously, are common.   The CSR end of the business has to be made aware that when customers supply new content art there may be workflow impact - especially if the change involves color, e.g., a logo change.  This is an organizational issue which must be addressed.

Testing failure is inevitable and is due to a number of issues.  Many programmers are basically unaware of color and how it works and make wrong assumptions - particularly when coding color related functions into the workflow. 

Another issue is lack of coverage.  A customer may be supplying dozens of logos - all with slightly different incorrect colors.  You have to make sure that you identify all the logos in question for correction - not just the first few you encounter.

You must also look for interactions between color transforms, e.g., if I am changing a gray to another color and I am also changing the shade of another, similar gray I don't want there to be a bad interaction.   This requires careful output inspection.

Programming errors are usually not found until real data is provided.  Test data will often cover only what a programmer expects to test and not the full range of real world issues.

Friday, August 13, 2010

Debugging Color... (cont.)

(Continued from the PDF Outsider because the same issues apply here...)

So once there is a problem it has to be sorted out.

Many times the source of the problem is hard to identify.  For example, proofs are always produced when a new customer is taken on and the customer sees something and approves it.  It turns out to be quite difficult to manage color in that context if equipment and workflow are changing.  Remember there might be a dozen steps along the way in the process.  Each step involves its own version of software and hardware.  Since these jobs run over time (months or years) various elements get upgraded and changed along the way - perhaps by another silo in the organization unrelated to production.  We call this "configuration creep".

Configuration creep is a significant problem in large companies with multiple print devices and multiple plants, and debugging color relative to it can be challenging because it may not be possible to "go back" to the approval configuration.  For example, a conversion element is upgraded and some new default converts color to RGB instead of CMYK - either intentionally or inadvertently.

Configuration creep is the nightmare of any production manager because it's totally out of his control yet it can completely stop production.  Debugging this problem is horrific because you have to trace back and determine what, if anything, might have changed.

Another debugging area is what we call "content creep".

Content creep is the process by which elements of large, complex production jobs change over time.  This is particularly important in workflows that involve cached assets, i.e., VDP workflows with PPML or AFP.  In this context assets of various sorts are used in jobs but are not organized as part of the job.  What I mean by this is that in AFP you can reference external job elements that are not part of the job per se.  In the print job there is a reference to the element - typically cached in the printer - rather than the actual element.  There are analogous scenarios for PDF (PPML).

So as long as the elements don't change everything works.  But what happens when a logo referenced in dozens of jobs changes?  For example, I am holding output from a job and it contains the new logo.  Is this particular job supposed to use the new logo?  Large workflows generally try and control content creep with elaborate asset management processes.  But these are only as good as the people who use them and quite often mistakes are made along the way.  Sometimes the shop floor personnel don't know that the asset should change and report problems that are not problems at all.

Another interesting debugging issue follows along these lines:  A customer calls and says that there is "ghosting" around small type on some specific jobs.  After much anguish it is determined that what's happening is that on some parts of the job around 6pt type the device is not registering colors accurately and CMYK black is causing a problem.  So here you have to debug the hardware first and determine if it's working correctly.  Given that it is, you next have to figure out how the CMYK black got into the workflow.  Since jobs come and go over the course of a year you may discover that someone working on the job last year created a bad asset.  Of course CMYK black is not allowed but that doesn't mean people don't find creative ways to inject it into the workflow.

You have to look for everything from a new asset sent by a customer that bypassed checking (somebody was in a hurry to get the new asset into production and just assumed it was correct) to a personnel change where someone forgot to follow the standard asset management steps.

Next post we will cover data driven color problems...

Wednesday, August 11, 2010

"Remaking" AFP

I have been working on my AFP "remake" engine.

The engine involves three parts.  There is a "scanner" which processes an AFP file looking for items we are interested in, e.g., embedded color or objects like page segments.  The scanner emits a "stitch list" that tells subsequent passes what we would like to change in the AFP structure.  In addition the scanner extracts the parts of the AFP file into sub-files and passes them on to a parallel array of processors to be processed.

The parallel array of processors has the second part, the "element processor", available to crunch the pieces.  The element processors have knowledge of what needs to be done to the pieces of the file, i.e., apply a specific color transform, and can recognize the pieces of AFP and their structure.  They read in the AFP, alter it, and write out a new version.

The last part is the "stitcher".  The stitcher waits around for the parts of the file to complete processing and then reassembles the AFP file from the "stitch list".  The stitch list tells the stitcher what it's waiting for so every once in a while it wakes up, looks to see if the parts are done, and, if they are, does its job.
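
The shape of the three parts can be sketched in a few lines.  This is a toy under loud assumptions: records are plain strings rather than AFP structured fields, a thread pool stands in for the parallel array of processors, the "transform" just upper-cases the piece, and the stitcher blocks on each result instead of waking up periodically to poll for finished sub-files.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(records):
    """Scanner: pick out pieces to change and record how to reassemble."""
    stitch_list, work = [], {}
    for index, record in enumerate(records):
        if record.startswith("PSEG:"):         # something we want to remake
            work[index] = record
            stitch_list.append(("processed", index))
        else:
            stitch_list.append(("as-is", record))
    return stitch_list, work


def process_element(record):
    """Element processor: stand-in for a color transform on one piece."""
    return record.upper()


def stitch(stitch_list, futures):
    """Stitcher: rebuild the stream in original order, waiting on each piece."""
    out = []
    for kind, item in stitch_list:
        out.append(futures[item].result() if kind == "processed" else item)
    return out


records = ["BDT", "PSEG:logo", "text", "PSEG:shade", "EDT"]
stitch_list, work = scan(records)
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {i: pool.submit(process_element, r) for i, r in work.items()}
    remade = stitch(stitch_list, futures)
```

The stitch list is what keeps the output in the original order no matter which piece finishes first.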

The idea for this comes from an existing Lexigraph product called Krypton which works in a similar fashion.  Though for PDF we don't break the PDF apart to process it - most of the parallelization comes from processing pre-existing pieces of PDF into a larger aggregated PDF file.  So this idea works - it's been in production in Asia for many years at this point.

This is much more efficient than a "single pass" model where a given application would run through a single AFP file - particularly on today's multi-core servers.

Tuesday, August 10, 2010

Compressing AFP...

I have become interested in a scheme for compressing PDF into an AFP stream.

This would be useful for applying color transforms "up front" of the conversion to AFP (rather than in AFP or out of AFP).

The idea is along the lines of this.

However, in the world of PDF documents some changes would be necessary.  First off, the data being examined has somewhat different properties than the images FITSIO is trying to compress.  Sequential images of planets and gas nebulae are unlike sequential images of bank statements.  Another difference is that we can assume basically unlimited CPU/disk for parallelization.  Finally, there is no "transmission" requirement to send the images long distances via radio.

Though AFP supports tiling directly as an IOCA image construct my feeling is that it's not a commonly used construct and that making the tiles more general, i.e., full IOCA images on their own, would be a much better idea.

Another element of this is reuse of tiles.  Business documents tend to be constructed from templates with a long-running stream of pages, i.e., a mail stream, containing a small number of templates.  Within each type of template individual changes occupy a relatively small portion of the document.

The only catch is that you have to be able to quickly determine the reuse level of each tile...
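One way to get at the reuse level quickly is to hash each tile and count how often each hash shows up across the page stream. The sketch below is only an illustration of that idea, not something I have in production: it assumes pages are already rasterized as row-major, one-byte-per-pixel buffers of identical size, and the function names are invented here.

```python
import hashlib
from collections import Counter


def tiles(page, width, height, tile):
    """Yield (w, h, data) for each tile of a row-major, 1-byte-per-pixel page."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            x_end = min(x + tile, width)
            y_end = min(y + tile, height)
            rows = [page[r * width + x : r * width + x_end]
                    for r in range(y, y_end)]
            yield x_end - x, y_end - y, b"".join(rows)


def tile_key(w, h, data):
    """Hash the tile together with its shape so edge tiles cannot be confused."""
    return hashlib.sha1(f"{w}x{h}:".encode() + data).digest()


def reuse_levels(pages, width, height, tile):
    """Count how many times each distinct tile occurs across all pages."""
    counts = Counter()
    for page in pages:
        for w, h, data in tiles(page, width, height, tile):
            counts[tile_key(w, h, data)] += 1
    return counts


if __name__ == "__main__":
    blank = bytes(16)                  # 4x4 page, all zero
    marked = bytes([1]) + bytes(15)    # same page with one changed pixel
    levels = reuse_levels([blank, marked], width=4, height=4, tile=2)
    print(sorted(levels.values()))
```

Tiles with a high count are candidates to store once and reference many times. Tiles with a count of one carry the per-document changes.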

Wednesday, August 4, 2010

Not to be forgotten...

As I said I've been posting over in the Lone Wolf blog.

At this point I am almost through the discussion of how the color transformation process functions - so that will be out of the way.

As for our product APX Raster Pro and AFP - the combination of these is currently functioning - though with a slightly less general set of transforms than described on Lone Wolf.

At this point we are actively looking for Beta testers and such for this AFP product.

I hope to finish up the Lone Wolf stuff within a week or two and return to this part of the blog.

Tuesday, July 27, 2010

New Color Management Technology to be Released...

I think this will be of great interest.

follow it here...

The core technology will be discussed on Lone Wolf as it is not AFP specific.

Color in AFP

AFP has a number of interesting places that it hides color.

Most obvious from what I have posted so far are images. AFP does not support as elaborate a model as PDF but there are still many places you can put color.

In AFP there are some 150 or so types of records. Each type of record does something unique and specific. Within these record types there are several more sub-record types. I divide the main AFP record types up into different categories based on my goals. For example, for color processing Page Segments (PSEGs) are used to hold images, bar codes, and graphics. One Page Segment Begin/End record pair can bracket up to a few dozen or so other AFP record types - mostly related to specifying things about height, width, resolution, and so on. Within each sub category, e.g., graphics, there are record types that hold sub-records, e.g., for drawing a box, stroking a line, etc.

So in terms of color you need to be able to find the PSEGs in an AFP file and then delve into them to find their sub-parts. Only at that point can you then think about processing color.

Let's consider the GOCA sub-record type in PSEGs. Color shows up in a couple of GOCA record types. Here we will consider GSPCOL because it's the most general. This is basically a "set color" operator. It supports RGB, CMYK, LAB and a few other types of color.

So to find and change a GOCA GSPCOL here's basically what you have to do:

- Scan the AFP file until you find a Begin/End pair of Page Segment records. These two record types bound all GOCA - though there may not be any GOCA in them.

- Scan the AFP records in the PSEG and look for Begin/End Graphics. If we don't find any we are done.

- If we find some, then we have to find the AFP records containing the sub-record GOCA commands.

- We then have to scan these sub records looking for GOCA GSPCOL record types.

- When we find a GSPCOL we then check its color to determine what we should do with it.
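The first few steps boil down to a small state machine, roughly like the Python sketch below. It assumes the AFP file has already been split into `(sfid, payload)` tuples. The structured field identifiers are my reading of the MO:DCA reference and should be checked before anyone relies on them. The last step, walking the GOCA orders inside each graphics data record to find GSPCOL, is left out because it depends on the GOCA order encoding.

```python
# Structured field identifiers as I understand them from the MO:DCA
# reference; treat them as assumptions and verify before relying on them.
BPS, EPS = 0xD3A85F, 0xD3A95F   # Begin / End Page Segment
BGR, EGR = 0xD3A8BB, 0xD3A9BB   # Begin / End Graphics Object
GAD = 0xD3EEBB                  # Graphics Data (carries the GOCA orders)


def goca_data_in_psegs(fields):
    """Yield (index, payload) for each graphics data record that sits inside
    a Begin/End Graphics pair which is itself inside a Begin/End Page Segment.

    `fields` is an iterable of (sfid, payload) tuples already split out of
    the AFP file.
    """
    in_pseg = in_graphics = False
    for index, (sfid, payload) in enumerate(fields):
        if sfid == BPS:
            in_pseg = True
        elif sfid == EPS:
            in_pseg = in_graphics = False
        elif in_pseg and sfid == BGR:
            in_graphics = True
        elif sfid == EGR:
            in_graphics = False
        elif in_pseg and in_graphics and sfid == GAD:
            # Each payload found here still has to be scanned for GSPCOL.
            yield index, payload


if __name__ == "__main__":
    demo = [(BPS, b""), (BGR, b""), (GAD, b"orders"), (EGR, b""), (EPS, b"")]
    print(list(goca_data_in_psegs(demo)))
```

Keeping the record index around matters because a changed record has to go back into the same place in the file.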

Images work in a similar fashion, but with a difference. In the case of images there is also the actual image raster to consider, i.e., in GOCA you can change the color of a box with GSPCOL in a GOCA AFP record. For images you have to do basically the same, but also change the actual raster.

Unfortunately each major color-carrying AFP record type, e.g., PTX, has a similar yet unique process required for it.

Sunday, July 25, 2010

The road to an AFP product...

So my customer has a very large license from a European AFP software vendor. They've had this license for at least 10 years and they use it to do both original mailing work as well as "convert" jobs.

A converted job is a job where they receive a set of pages from a customer along with mailing data. They use their AFP software to create "mailing labels" that overprint the page's original mailing information with new, presorted information. The converted jobs are discussed more in my PDF Outsider blog.

The in-house composition jobs involve receiving customer logos that must be processed in AFP. Fortunately or unfortunately the logo data arrives as "traditional" Mac file formats such as Quark, Photoshop, etc. As a result these logos must be converted to AFP via a conversion tool. The tool provides several options to convert PDF and PostScript to an AFP "Page Segment" - which is basically an image.

This customer suffers with a number of peculiar problems. First of all there are various versions of the converter software. Some create Page Segments (PSEGs) with certain types of compression, others with different types. Some versions use lossy compression, others don't. Of course there are various options to control these things but either they don't work or the customer can't get them to work. At the end of the day they suffer constantly with bad PSEG files.

The customer has this all hooked up with various other tools from us and other vendors to create a series of PSEG files with various rotations (AFP lacks a general purpose rotation operator for PSEGs). So we were able to bypass all of the spit and baling wire and create a single application to convert PDF and PS directly to the proper set of rotated PSEG files.

The critical issues for this tool were 1) convert with LZW compression (no data loss), 2) provide alpha channel data based on color (white in this case is white), 3) emit a series of TIF files (rotated at 0, 90, 180 and 270), and 4) emit a PDF with four pages having the same rotations as in #3. After some work I was able to create a tool to handle the AFP conversions for images and combine this with some of our existing technology to create a conversion product.

We gave them this tool, set up their hot folders, etc. and now they are starting to use it. Since we understand this entire process end-to-end now we are able to give them much better support than their other vendor. For example, they called up and said "the transparency is not working for this PDF". It turned out their other AFP application was using the wrong version and failed to emit transparency (this is hard to imagine for a mature product). Now they are calling us about all sorts of AFP problems...

So it occurs to me that they can't be the only ones having these problems and I start to sketch out an AFP product to help them.

Friday, July 23, 2010

What else makes AFP more interesting than PDF...

Historically it appears that AFP constructs such as text positioning and images were much more closely tied to the resolution of the device. Early devices were 240 dpi with single on/off pixel resolution. It was not until the introduction of GOCA, for example, that true arbitrary text positioning became possible.

Modern AFP devices today have much higher resolution (360/720 dpi) and color. AFP color also supports a fixed OCA space containing what were probably single highlight colors, e.g., there is the notion of RED and BLUE with separate, non-CMYK/non-RGB/... color. These probably came into play when highlight color machines were popular in the 1990s.

I believe that early AFP was designed for speed over portability given the notions of device resolution bitmap images and raster fonts. All the RIP had to do was move bits around. However, as PDF and PostScript began to encroach in the AFP world these notions were sacrificed.

What is not discussed with AFP is RIP performance. The reason, I think, is that historically jobs are effectively RIPed by the AFP producer: fonts are raster and any images that were used were supplied as "ready to image" in the proper device resolution.

However, with GOCA the printing engine now has to do the same work as a PostScript or PDF RIP. (Further I believe that PDF and PS images can be embedded in MO:DCA or IOCA somehow - how I don't know yet - leading to potential future RIP performance issues).

AFP is not as "smoothly consistent" as PDF. For example, transparency masks appear to work with certain types of image compression but not others. There are a myriad of image compression options but each Function Set only supports certain types in certain situations. These issues are laid out in the various manuals but their existence does not make life easier.

There are also device issues. For example, certain devices cannot handle AFP files with mixed image resolutions.

On the other hand AFP got the notion of external job resources right. Like PPML or VDX (which I was personally involved with for a while during its birth) there is the notion of external assets referenced in a job. Assets can be attached to the job or called out from a pool on the print device. I would imagine that there is a lot of IBM software to support managing this - though I am not completely certain of this because I recall overhearing some comments to the opposite a few years ago. (Lexigraph sells a large, powerful system for managing RIPed assets called RMX.)

All this said if you take a step back from AFP you run into the next set of issues: tool sets. The PS/PDF world has long had Photoshop, Illustrator, etc. to provide a means to manipulate images, vector art, and so on. While there are many "mail merge" type tools for AFP there seems to be a significant lacking in this area which is why I got involved...

Thursday, July 22, 2010

A little AFP background...

I don't know the full history of AFP but mostly it stems from the IBM 3800 printer days - basically this was one of the first laser printers for industrial use (another being the Xerox 9700). (In those days a "fuser-wrap" paper jam on the 9700 required an 18" crescent wrench to fix - but I digress.)

The most current information available for AFP, and, in particular, color AFP can be found at Output Links. This site contains most of the documentation you will need to understand to deal with AFP - though even "modern" AFP contains elements which I cannot find direct documentation for, e.g., IM Images.

From the perspective of a programmer or technician AFP is very different than PostScript or PDF. First and foremost AFP is record based. All of the commands and images are fit into variable length records up to 32K in length. Second, AFP uses a "wrapper model" to support extensions, i.e., an AFP X'5A' record "wraps" sub elements in their respective sub-records. So, for example, there is a sub language of records to describe images called IOCA. At the top level an AFP X'5A' record may contain multiple IOCA image sub-records.
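Here is a rough Python sketch of what "record based" means in practice. The layout it assumes is my understanding of a structured field: the X'5A' byte, a two-byte big-endian length that counts everything after the X'5A', a three-byte identifier, a flag byte, two reserved bytes, and then the data. It also assumes the records sit back to back with no extra record delimiters in the file, and it ignores whatever the flag byte may say about extensions or padding. `make_field` is a helper invented here only to build test data.

```python
import struct


def iter_structured_fields(data):
    """Yield (sfid, flags, payload) for each X'5A' record in `data`.

    Assumed layout: X'5A', 2-byte big-endian length (counting everything
    after the X'5A'), 3-byte identifier, 1 flag byte, 2 reserved bytes,
    then the payload.
    """
    pos = 0
    while pos < len(data):
        if data[pos] != 0x5A:
            raise ValueError(f"expected X'5A' at offset {pos}")
        if pos + 9 > len(data):
            raise ValueError(f"truncated record header at offset {pos}")
        (length,) = struct.unpack_from(">H", data, pos + 1)
        end = pos + 1 + length
        if length < 8 or end > len(data):
            raise ValueError(f"bad record length {length} at offset {pos}")
        sfid = int.from_bytes(data[pos + 3 : pos + 6], "big")
        flags = data[pos + 6]
        yield sfid, flags, data[pos + 9 : end]
        pos = end


def make_field(sfid, payload=b"", flags=0):
    """Build one record in the assumed layout (illustration and tests only)."""
    length = 8 + len(payload)
    header = struct.pack(">H", length) + sfid.to_bytes(3, "big")
    return b"\x5a" + header + bytes([flags, 0, 0]) + payload


if __name__ == "__main__":
    stream = make_field(0xD3A8A8, b"abc") + make_field(0xD3A9A8)
    for sfid, flags, payload in iter_structured_fields(stream):
        print(f"{sfid:06X}", flags, payload)
```

Once the file is split this way, the sub-languages such as IOCA live inside the payloads and need their own parsers.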

To make life a little simpler for anyone following this I have broken the AFP structure down below:

Image - AFP uses a sub-language called IOCA to describe images. Basically this is like image descriptions in PDF or PostScript - you have various color spaces (CMYK, RGB, LAB, ...), profiles, transparency masks, and the like.

Vector - AFP uses a sub-language called GOCA to describe this. GOCA has strokes, fills, winding, etc. along with full color specification.

Text - Things are a little more interesting here. AFP supports a variety of font types - some modern and some not. On the modern side there is support of Unicode and OpenType. On the not so modern side there are bitmap image fonts. To further complicate things there are multiple notions for placing characters on the page.

Text can be placed via traditional line-printer-like commands, via PTOCA (Presentation Text Objects), or via GOCA. Of course, each of these models works differently.

The biggest difference between AFP and something like PDF is that characters cannot be directly and simply "outlined" nor "rotated arbitrarily". While you can "do" all the same things in AFP as you can in PDF - it can be (much) more complex to accomplish.

Another important issue is printer capability. All color AFP devices are designated with "Function Set 45". There are older, less capable printers which can do simply black and white images, and those with even less ability. We won't consider those other than Function Set 45 here.

There are many commercial tools to create AFP from most of the major mainframe-type software vendors. Some vendors support both PDF-type and AFP-type languages.

IPDS - This is a relative of AFP that is used by many high-speed inkjet devices. While the language is different IPDS supports AFP IOCA images so anything discussed here related to IOCA also applies to IPDS.

Wednesday, July 21, 2010

The road to being an AFP Outsider...

Since blog sites don't have nice neat ways to create categories I have created this blog to track my thoughts about AFP. AFP (Advanced Function Presentation/Print) was created by IBM for printing. It is being dragged into the 21st century as a color print medium. This blog is to relate my experience in creating an AFP product. (There is a companion blog at The PDF Outsider discussing my feelings about PDF.) This is also linked to my The "Lone Wolf" Graphic Arts Technologist blog.

I am applying my knowledge of PDF and color to AFP.

So why will I be the AFP Outsider?

Primarily because I don't subscribe to the associated AFP dogma, I tell customers the truth, and vendors of AFP products mostly don't like that.

When my customers have problems with AFP and AFP printers I want to solve them irrespective of any associated dogma.

I will document my journey here (which has only just begun since the beginning of 2010).

This blog is Copyright (C) Todd R. Kueny, Sr.