Create Reports with Scribus & Drupal (Part 2)
In the previous examples we used the integrated Python scripting extension of Scribus to create reports. We accessed a MySQL database directly, but that is rarely possible on the Internet. More often, the database is hidden behind a CMS of some sort. While my experience comes from using Drupal, the approach should not depend on any specific CMS.
Now, to get the data from the database into Scribus fully formatted, we either have to produce a complete SLA file or provide an interim data file that is then read by a Python script in Scribus. For users, simply downloading an SLA file and opening it is the easiest route, so that is the preferred solution.
The ugly hack variant
Thankfully, Scribus files are a documented XML format (optionally gzipped). However, creating files from scratch is not really an option because invalid files fail silently and validity is often not just a question of being well-formed. With 1.3.6 this seems to have gotten a lot easier since Scribus now pops up a warning with a parse error and a line and column number. (If you want to test it, just insert a plain ampersand into ITEXT's CH.)
Even with that, it might not be obvious why a text frame needs a dozen or more parameters just to display, so writing files from scratch remains impractical. Since one probably does not have the time to study the XML format, we'll just cut and paste. This works especially well in the first example, where we have a single page and simply want to fill in values.
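The ampersand test mentioned above can also be run before the file ever reaches Scribus. The following is a minimal sketch using PHP's DOM extension; the helper name sla_parse_errors() is made up for this example. It only checks that the XML is well-formed, which, as noted, does not guarantee that Scribus accepts the file.

```php
<?php
// Hypothetical helper: return a list of parse errors for an SLA string,
// or an empty array if the XML is at least well-formed.
function sla_parse_errors($xml) {
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $errors = array();
    foreach (libxml_get_errors() as $e) {
        $errors[] = 'line ' . $e->line . ', column ' . $e->column . ': ' . trim($e->message);
    }
    libxml_clear_errors();
    // Restore whatever error handling mode was active before.
    libxml_use_internal_errors($previous);
    return $errors;
}

$good = '<ITEXT CH="Fish &amp; Chips"/>';
$bad  = '<ITEXT CH="Fish & Chips"/>';   // plain ampersand, not well-formed

foreach (sla_parse_errors($bad) as $message) {
    echo $message . "\n";
}
```

An empty result for the generated file is a cheap sanity check before offering it for download.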
Coversheet
The parts you will cut up and modify are the PAGEOBJECT elements. If you open up your SLA file you'll see that a PAGEOBJECT often comes with 10-30 parameters. The only one shown here is ANNAME, since it's the object's name, which you can set in the properties panel.
I recommend setting it within Scribus so you can easily find your object in the SLA file. If you leave the default, the parameter will be empty, and the Text4 or Text5 that Scribus reports is just a counter based on the objects present.
<PAGEOBJECT ANNAME="LoremA">
<ITEXT FONT="DejaVu Sans" FONTSIZE="20" CH="Lorem Ipsum"/>
<para/>
<ITEXT FONT="DejaVu Sans" FONTSIZE="12" CH="Dolor sit amet"/>
<PageItemAttributes/>
</PAGEOBJECT>
Given this structure, you can splice your XML file together so that the value of the CH parameter is replaced with your variable. The result should be a valid SLA file that shows your changes.
However, any special characters in your string are likely to cause problems and have to be properly escaped, so run htmlspecialchars() on it first. Furthermore, if you have HTML rather than plain text, you have to strip out those tags, too. With Drupal, check_markup() and check_plain() are a great place to do the heavy lifting.
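As a rough sketch of that clean-up, the snippet below uses only plain PHP functions so that it also runs outside Drupal (inside Drupal, check_plain() should cover the escaping step). The helper name sla_attr_text() and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: turn a CMS field value into text that is safe
// inside the CH="..." attribute of an ITEXT element.
function sla_attr_text($value) {
    // Drop any HTML tags first, then decode entities so that text which
    // already contains "&amp;" does not end up double-escaped.
    $plain = html_entity_decode(strip_tags($value), ENT_QUOTES, 'UTF-8');
    // Escape &, <, > and both quote styles for use in an XML attribute.
    return htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');
}

$ch = sla_attr_text('<b>Tom & Jerry</b> say "hi"');
echo '<ITEXT FONT="DejaVu Sans" FONTSIZE="12" CH="' . $ch . '"/>' . "\n";
```

The order matters here: stripping tags after escaping would leave nothing to strip, because the angle brackets would already be entities.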
If you want to preserve the line breaks that <p></p> produces (and probably <br />, too, in many cases), I recommend splitting the string into substrings at those tags and reconnecting them later. Otherwise you will likely lose track of where to apply htmlspecialchars() and where to substitute and escape. Afterwards, you can splice the substrings together with:
"/> <para/> <ITEXT CH="
Barcode labels

In this example it becomes patently clear that working with PHP's XML DOM would be the smarter option. I hope to be able to follow this post up with a “Part 2b” sometime in the near future that does exactly that but I'm not betting on it.
We begin by taking the example SLA we have from previous attempts. If you start from scratch, style the document as you like and consider having more than one page to get a feel for page structure.
First, the parameter ANZPAGES (sounds like Denglish to me) of the DOCUMENT element is simply the number of pages. For my example with 30 labels per page I got the number from the views module with:
$pagenums=ceil(count($themed_rows)/30);
Next, after the MASTERPAGE elements you have to define all pages. If you have two pages to compare, you should be able to figure out that PAGEYPOS and NUM are the two parameters to increment in PAGE. I don't pretend to even remotely understand that never-ending canvas; in my case it just turned out that for letter pages, starting with an initial offset of 20 and then incrementing by 792 per page (PAGEHEIGHT for letter paper) plus 40 did the job.
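The page arithmetic described above can be sketched as follows. The numbers 20, 792 and 40 come from my letter-sized file; the assumption that NUM starts at 0, the variable names, and the 75 dummy rows standing in for the views output are only for illustration.

```php
<?php
// Assumed values from the text: letter pages (792pt high), first page at
// offset 20, and an extra 40pt added per page on the canvas.
$labels_per_page = 30;
$page_height = 792;
$first_offset = 20;
$page_gap = 40;

// Stand-in for the themed rows that the views module would deliver.
$themed_rows = array_fill(0, 75, 'label');

// ceil() returns a float, so cast it for use as a loop bound.
$pagenums = (int) ceil(count($themed_rows) / $labels_per_page);

$pages = array();
for ($num = 0; $num < $pagenums; $num++) {
    $pages[] = array(
        'NUM' => $num,
        'PAGEYPOS' => $first_offset + $num * ($page_height + $page_gap),
    );
}

foreach ($pages as $page) {
    echo 'NUM="' . $page['NUM'] . '" PAGEYPOS="' . $page['PAGEYPOS'] . '"' . "\n";
}
```

For other paper sizes, compare two pages in your own file before trusting these numbers.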
The PAGEOBJECT elements were a bit frustrating, but at least you don't have to interleave PAGE elements or other fun stuff. First, OwnPage is of course a reference to the page this object belongs on; simply increment it whenever you move on to the next page.
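With a fixed number of labels per page, OwnPage can be derived from the label's running index instead of being counted by hand. A small sketch, assuming OwnPage uses the same zero-based numbering as NUM; the helper name sla_own_page() is made up.

```php
<?php
// Assumption: OwnPage uses the same zero-based numbering as NUM in PAGE.
$labels_per_page = 30;

// Hypothetical helper: map a zero-based label index to its page.
function sla_own_page($index, $per_page) {
    // Labels 0..29 land on page 0, labels 30..59 on page 1, and so on.
    return (int) floor($index / $per_page);
}

foreach (array(0, 29, 30, 74) as $i) {
    echo "label $i -> OwnPage " . sla_own_page($i, $labels_per_page) . "\n";
}
```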
The XPOS and YPOS parameters can be filled with the same methodology outlined in Part 1. The WIDTH and HEIGHT parameters are exactly what one expects. The tricky one is POCOOR, though. I have no idea why it is the way it is. I simply noticed that it alternated width and height in the following pattern in my files:
$po="0 0 0 0 $w 0 $w 0 $w 0 $w 0 $w $h $w $h $w $h $w $h 0 $h 0 $h 0 $h 0 $h 0 0 0 0";
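One way to read that pattern is as the outline of the frame's rectangle: the corners (w, 0), (w, h) and (0, h) each written four times as x y pairs, framed by the origin at both ends. That reading is my assumption, not documented behaviour, and the helper name sla_pocoor() is made up. Building the string from the corners avoids typos in the long literal.

```php
<?php
// Hypothetical helper: build the POCOOR string for a plain rectangle.
// Assumption: the value walks the rectangle's corners in order, with the
// origin written twice at the start and twice at the end.
function sla_pocoor($w, $h) {
    $corners = array(array($w, 0), array($w, $h), array(0, $h));
    $coords = array(0, 0, 0, 0);              // opening origin, two x y pairs
    foreach ($corners as $c) {
        for ($i = 0; $i < 4; $i++) {          // each other corner four times
            $coords[] = $c[0];
            $coords[] = $c[1];
        }
    }
    array_push($coords, 0, 0, 0, 0);          // closing origin, two x y pairs
    return implode(' ', $coords);
}

echo sla_pocoor(180, 72) . "\n";
```

The result always has 32 numbers, which is a quick way to check a hand-written value.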
Afterword
So, was that even a marginally sensible approach? Maybe not. For my use case, I just know this worked as a drop-in replacement for the system in the first part. In the future I might attempt to use OpenDocument instead of SLA (as a very speculative third installment of this series) and in turn use OpenOffice Writer rather than Scribus for this.
I do think that complex workflows are still very much a weakness of many open source web-based collaboration tools and CMSs. An example of that is OpenAtrium: you can have a private group and collaboratively author and edit documents, and it's lovely to work with, but to export the document(s) you are left with copy and paste (my feature request on that). Of course, just having export options would not fix my very specific examples here, but I think it could have simplified things a lot.



