Fear the Cowboy

Life of Microsoft Open Source Developer

Notes about shared libraries in CoApp

April 13, 2010 08:25 by author Garrett Serack

(cross-posted to the mailing list)

Since I’ve been jumping all around the map on answering questions, I wanted to first jump into the heart of what CoApp really fixes, and we’ll work our way out from there.

Libraries (static or dynamic) are the heart and soul of pretty much all software, and open source is no exception. If code didn’t depend on any other code, then packages would be insanely trivial to engineer: we’d just zip up the files and that would be it.

However, since this isn’t the case, we need to understand what Libraries mean to us, and what we need to ensure to make everything end up shiny.

What CoApp will address:

  • There must be a common method to access a shared library, in a logical, consistent fashion.
  • A particular version of a library (with a specific binary ABI) must be upgradable to a new compatible version without having to adjust a currently installed application.
  • Multiple versions of a library (with potentially different binary ABIs) must be able to be present at the same time.
  • Multiple compilers must be supported: that is, multiple copies of the same library, with the same version and the same ABI, but relying on a different compiler (and C runtime).
  • Libraries must be installed and upgraded independently of an application.
  • An application must be able to override a system default version of a library if necessary.
  • Shared libraries should always be packaged with their relevant import libraries (.lib) and header files (.h).

Luckily, Windows provides us a way to do most of that without much difficulty—provided you have tools to automate that.

WinSxS (Windows Side-by-Side) technology allows us to install multiple versions of libraries, each tagged with a version (Major.Minor.Revision.Build), and allows us to build ‘policy’ files that direct programs to use the correct version. We ship a manifest with each application to state what version (Major.Minor) it requires, and the policy files direct the EXE to the best match (most of the time, the most recent version in a given Major.Minor set).

Consistency of the Major.Minor versions indicates binary ABI compatibility. Changing the Major or Minor numbers in effect declares that binary compatibility may not be guaranteed (however, policies can be written to forward older versions if the author claims binary compatibility is still present).
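The matching rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Windows itself resolves assemblies; the function names `parse_version` and `pick_best` are made up for the example, and a real policy file could redirect to something other than the newest build.

```python
def parse_version(text):
    """Turn 'Major.Minor.Revision.Build' into a tuple of four ints."""
    parts = tuple(int(p) for p in text.split("."))
    if len(parts) != 4:
        raise ValueError("expected Major.Minor.Revision.Build: %r" % text)
    return parts


def pick_best(installed, required):
    """Return the newest installed version sharing required's Major.Minor.

    installed: iterable of version strings present on the machine.
    required:  'Major.Minor' string taken from the application's manifest.
    Returns None when no ABI-compatible version is installed.
    """
    major, minor = (int(p) for p in required.split("."))
    candidates = [parse_version(v) for v in installed]
    # Only versions with the same Major.Minor are treated as ABI compatible.
    compatible = [v for v in candidates if v[:2] == (major, minor)]
    if not compatible:
        return None
    # Tuples compare numerically, so 1.2.3.10 correctly beats 1.2.3.7.
    return ".".join(str(n) for n in max(compatible))


installed = ["1.2.0.0", "1.2.3.7", "1.2.3.10", "1.3.0.1", "2.0.0.0"]
print(pick_best(installed, "1.2"))  # newest build in the 1.2 set
print(pick_best(installed, "3.0"))  # nothing compatible installed
```

Comparing the parsed tuples rather than the raw strings matters here: a plain string comparison would rank "1.2.3.7" above "1.2.3.10".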

To use WinSxS, however, all binaries must be signed with an Authenticode code-signing certificate from a reputable CA (certificate authority).

This signing requirement actually turns out to be the key to supporting multiple compilers at the same time: a publisher can use multiple certificates, reserving each certificate for a particular compiler. (So CoApp as a publisher will have separate certificates for signing VC9 and VC10 binaries.)

In order for the consuming application to specify what library it is looking for, its manifest lists the certificate thumbprint, the name of the library and the version.
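As a rough sketch, here is how such a dependency entry could be generated with Python's standard library. The element and attribute names follow the Windows side-by-side manifest schema as I understand it; the value this post calls the thumbprint is assumed to correspond to the `publicKeyToken` attribute, and the library name, version and token below are placeholders rather than real CoApp values.

```python
import xml.etree.ElementTree as ET

ASM_NS = "urn:schemas-microsoft-com:asm.v1"


def dependency_manifest(name, version, public_key_token, arch="x86"):
    """Build an application manifest that depends on one shared library."""
    # Emit the namespace as the default one instead of an 'ns0:' prefix.
    ET.register_namespace("", ASM_NS)
    assembly = ET.Element("{%s}assembly" % ASM_NS, {"manifestVersion": "1.0"})
    dependency = ET.SubElement(assembly, "{%s}dependency" % ASM_NS)
    dependent = ET.SubElement(dependency, "{%s}dependentAssembly" % ASM_NS)
    # The identity carries the three things the post lists: who signed the
    # library, what it is called, and which version is wanted.
    ET.SubElement(dependent, "{%s}assemblyIdentity" % ASM_NS, {
        "type": "win32",
        "name": name,
        "version": version,
        "processorArchitecture": arch,
        "publicKeyToken": public_key_token,
    })
    return ET.tostring(assembly, encoding="unicode")


# Placeholder values, for illustration only.
print(dependency_manifest("zlib", "1.2.3.0", "0123456789abcdef"))
```

Under the scheme described above, a VC9 build and a VC10 build of the same library would differ only in the token, since each compiler gets its own certificate.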

Hey, rather than commenting here, come join the mailing list (join the team at https://launchpad.net/~coapp-developers) and continue the conversation!




The 90 second description on how CoApp packages will get built.

April 13, 2010 08:14 by author Garrett Serack

So,

I’ve been taking questions as to how CoApp packages get built.

Lemme see if I can sketch out the vision for you, so that you get an idea of where it’s going. This isn’t set in stone, but I’ve actually validated this is a workable solution.

 

And, before we get too far, let me make this exceedingly clear: This is ONE METHOD to generate a package that conforms to the CoApp package specification. CoApp packages do not have to be built by the tools described here, but merely conform to the spec.

And, I should mention that yes, it may be possible to bring tools like CMake into the mix. Doing so is not the shortest path, but it may provide additional benefits in the longer run, so I’m not ruling out its involvement.

 

Let’s say I want to create a library package for zlib.

First, I’m going to import the zlib source code into a Bazaar branch in a new CoApp sub-project on Launchpad.

Checking out from there, I’ll first see if the project can be compiled at all using MSVC (any version).  If it has an older project file, I’ll load it up in Visual Studio 10, and let it upgrade the project, and I’ll save it.

Drop back to the command line.

The SCANTOOL tool can be pointed at the source directory to scan through all the source files and build files and generate some intelligence about the project as a whole. It gets a list of all source files (C, C++, .H, etc.), potential conditional defines present in the source (#define FOO …), and identifies what additional files are present in the project (for which we’ll have to determine what to do: delete, include in final as resources, ???). SCANTOOL dumps all of this data into an XML intelligence file for the project.
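To make the scanning step a little more concrete, here is a minimal Python sketch of that kind of scan. It is not SCANTOOL itself: the XML element names (`project`, `sources`, `defines`, `otherFiles`), the list of extensions and the function name `scan_tree` are all assumptions made for the example.

```python
import os
import re
import xml.etree.ElementTree as ET

# Assumed set of extensions that count as C/C++ source or header files.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".h", ".hpp"}
# Matches the symbol name in lines such as '#define FOO 1'.
DEFINE_PATTERN = re.compile(r"^\s*#\s*define\s+([A-Za-z_]\w*)", re.MULTILINE)


def scan_tree(root_dir):
    """Walk root_dir and return an XML element summarising what was found."""
    project = ET.Element("project", {"root": root_dir})
    sources = ET.SubElement(project, "sources")
    defines = ET.SubElement(project, "defines")
    others = ET.SubElement(project, "otherFiles")
    seen_defines = set()

    for folder, _dirs, files in os.walk(root_dir):
        for name in sorted(files):
            path = os.path.join(folder, name)
            relative = os.path.relpath(path, root_dir)
            extension = os.path.splitext(name)[1].lower()
            if extension in SOURCE_EXTENSIONS:
                ET.SubElement(sources, "file", {"path": relative})
                with open(path, encoding="utf-8", errors="replace") as handle:
                    seen_defines.update(DEFINE_PATTERN.findall(handle.read()))
            else:
                # Anything else still needs a human decision later on.
                ET.SubElement(others, "file", {"path": relative})

    for symbol in sorted(seen_defines):
        ET.SubElement(defines, "define", {"name": symbol})
    return project
```

Calling `ET.tostring(scan_tree("path/to/zlib"), encoding="unicode")` would produce the XML text for a source tree at that (placeholder) path.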

Build the project (either by the makefile, the .vcxproj file, or whatever means necessary). When doing so, however, use the TRACE tool to watch the library get built. TRACE creates an XML file recording every file access (write, read, delete) and every command line for the build process and all its child processes.

At this point the developer can create a hand-made intelligence file as well for things that are known about the project (what targets are desired, etc).

The intelligence files and the trace data are fed into another tool MKSPEC, which creates a set of .spec files, each of which describes a binary output desired from the project (a .LIB , .DLL, .EXE, etc) and lists the files needed, conditional #defines, and other options. (this is essentially a compiler-neutral way of representing what is needed to build a particular output)

Each .spec file is then fed into MKPROJECT, which will generate a VC10 project file. Plugins for MKPROJECT can trivially build other types of project files for things like VC9, makefiles for MinGW, or CMake files for the CMake faithful. MKPROJECT also ties together a collection of project files into a .SLN file for Visual Studio. Outputs are normalized for naming conventions.

The .SLN file is fed into Visual Studio (or MSBuild, the command line tool) and it compiles up the binaries.  (I’ve got a plan for PGO as well, [profile guided optimization], but I’m going to ignore that right now)

The binaries are fed into a tool called SMARTMANIFEST which creates .manifest and policy files for the library and binds them to any .DLLs and .EXEs created.

The binaries (and manifest data) along with the project source code and build files are fed into MKPACKAGE which uses WiX to build MSI files for each binary, along with a source MSI with just the necessary files to rebuild the binaries (source, vcxproj, sln). 

At that point the developer can identify what files can be trimmed from the source tree, and the whole thing can be updated in Bazaar.

http://twitpic.com/rqmo5 -- a flowchart of what I just described. Well, without TRACE.

(there’s a lot more detail to be sure, but that’s the gist of it)
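For those who prefer to read a pipeline as data, the flow described in this post can be written down as a list of stages and checked for ordering. This is just a sketch: the artifact labels ("scan intelligence", "trace data" and so on) are my shorthand for this example rather than names from the CoApp spec, and the optional hand-made intelligence file is left out.

```python
# Each stage: (tool name, inputs it consumes, outputs it produces).
PIPELINE = [
    ("SCANTOOL", {"source tree"}, {"scan intelligence"}),
    ("TRACE", {"source tree"}, {"trace data"}),
    ("MKSPEC", {"scan intelligence", "trace data"}, {"spec files"}),
    ("MKPROJECT", {"spec files"}, {"project files"}),
    ("MSBuild", {"project files", "source tree"}, {"binaries"}),
    ("SMARTMANIFEST", {"binaries"}, {"manifests"}),
    (
        "MKPACKAGE",
        {"binaries", "manifests", "project files", "source tree"},
        {"MSI packages"},
    ),
]


def check_pipeline(stages, initial):
    """Verify every stage's inputs exist before it runs; return all artifacts."""
    available = set(initial)
    for tool, inputs, outputs in stages:
        missing = inputs - available
        if missing:
            raise ValueError("%s is missing %s" % (tool, sorted(missing)))
        available |= outputs
    return available


artifacts = check_pipeline(PIPELINE, {"source tree"})
print(sorted(artifacts))
```

Reordering the list so that, say, MKSPEC runs before TRACE makes `check_pipeline` raise, which is a cheap way to see which steps genuinely depend on each other.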

Hey, rather than commenting here, come join the mailing list (join the team at https://launchpad.net/~coapp-developers) and continue the conversation!




What’s this ‘CoApp’ all about?

April 7, 2010 09:30 by author Garrett Serack

Last week, I blog’d about a new open source project that I’ve launched called “CoApp” (The Common Opensource Application Publishing Platform). As I’ve mentioned on the project site, “CoApp aims to create a vibrant Open Source ecosystem on Windows by providing the technologies needed to build a complete community-driven Package Management System, along with tools to enable developers to take advantage of features of the Windows platform.”

Ugh, a mouthful, and all chock full of them shiny marketing words. (Uh.. yeah, I know I wrote that.)

 

So, what does that mean?

Well, while Windows provides some pretty good stuff for packaging applications in the form of Windows Installer* technology (aka “MSI”), the down side is that the open source community hasn’t really picked it up in the same way that they have picked up packaging on other platforms where they create repositories and distributions of software, and so we’re missing out on having these nice, consistent collections of all these great open source apps.  That’s where I really want to be.

‘Course, my pappy always used to tell me “it don’t take a genius to spot a goat in a flock of sheep” … Sure, it’s easy to see what the problem is; question is, how do we go about fixin’ it?

 

Last fall, I started to sketch out what that should look like, and what it would take to get there.  After a few months of poking the right people, I started to get agreement here at Microsoft that this really is a great idea, and we should be spending time on it. (And, ‘course, by ‘we’ I mean ‘me’)  I know from personal experience with building open source software on Windows, that things are not only sometimes tricky, but often downright impossible to build correctly, and even harder to make sure the software is built in such a way that anyone on Windows could use it.  I’ve come up with a plan for building a set of tools to help open source software build better on Windows, along with automating the packaging in such a way that will allow us to build yet more shiny tools to locate and install them.

Along with the tools, we’re going to need to lay down some guidance on how to use them to build packages that play nice with each other—I want to make sure that I’m never running into “DLL Hell”, never having to search for missin' bits, and always getting the right package for the right job.  At the same time, I really want to use some optimization techniques to help open source software run better on Windows.

 

Starting with ApacheCon last fall, I began to reach out to people I know in open source communities, not only to get their buy-in that this is a good idea, but solicit their help. I’ve already secured a handful of folks who are interested in helping, and I can always use a few more.

Over the course of the next month or so, we’ll be filling in the details on what all of this looks like on the project site, and discussing the merits on the mailing list. From there, we’ll begin to build the tools, and with a bit of luck, we’ll start producing packages a few more months after that. We’ll probably start with the packages that make the most sense (Apache, PHP and Python) and work our way out from there.

 

And just how does Microsoft fit into all of this?

Well, the folks here at Microsoft have recognized the value in this project and have kindly offered to let me work on it full-time.  I’m running the project; Microsoft is supporting my efforts in this 100%. The design is entirely the work of myself and the CoApp community; I don’t have to vet it with anyone inside the company. This really makes my job a dream job: I get to work on a project that I’m passionate about, make it open source, and let it take me where it makes sense.

 

Sure, it’s a large project, but I’m pretty sure that we’re headed in the right direction—if you’d like to come out and help (or even just come get more details about what I’m talking about), you can start at http://coapp.org.

 


* I know, some people don’t particularly like MSI, but trust me, it’s all in how it’s used—ya don’t blame the horse for throwin’ a shoe.





The Common Opensource Application Publishing Platform (CoApp)

March 31, 2010 13:13 by author Garrett Serack

Listen up folks, this stuff is big.

Today, I’m announcing the beginning of a project that intends to bring a little joy into the hearts of Open Source aficionados on the Windows Platform.

The biggest challenge to using, building and maintaining many Open Source applications on Windows is that Windows does a lot of things differently than Linux and Unix: different filesystems, command lines, APIs, user experiences … well, pretty much everything. Regardless of personal opinions about it being the ‘right-way’ or ‘wrong-way’, it suffices to say that it is just simply different.

In order to build an Open Source application like PHP for Windows from scratch, I need to have a collection of libraries created from a fair number of different projects.  This creates a dependency between the code that I’m working on—PHP—and the project that supplies the library that I need.  It’s pretty important that I not simply rely upon a previously compiled version of the library (provided either by the project itself, or a third party) for a number of reasons:

  • I want to make sure that the library is compiled with the same version of the compiler and libraries as I use.
  • In order to fine-tune performance, I’m going to need to change the compiler settings.
  • As a security precaution against malicious third parties creating flawed binaries.
  • Hey!--It’s Open source. It’s pretty much a moral imperative that I compile the code for myself. Well, it is for me anyway.

Now, unfortunately, those dependencies don’t necessarily share the same development environments, practices, tools, operating systems, or even ideas as to how things should—from one’s own perspective—be done (because, as every developer knows, one’s own way is the ‘one true way’).

Interestingly, this problem really doesn’t happen on Linux (and other *NIX-like substances).  When someone builds that same application (PHP) on Unix, they do so knowing that the OS works a certain way (generally speaking), and along with the dark magic known as autoconf, you can put the source code on nearly any Unix-variant and just build it.

Let me take a moment to talk about how this is done in the Linux/Unix world.  This isn’t nearly as much of a problem there because nearly all libraries come with a ‘configure’ script of sorts which the developer runs prior to building the code; the script checks the local development environment, determines the appropriate settings, compilers and dependencies, and creates a build script to match. You download the source, unpack it, and run ./configure && make && make install.  If you are missing any dependencies, you download them, unpack, run ./configure && make && make install, and go back to the app.

Shared Libraries end up in a common spot (/usr/lib), header files end up in a common spot (/usr/include) and binaries can go into a common spot (/usr/bin).

There are some tools and conventions that make this all work pretty darn good, and when it doesn't, it's usually not much of a stretch to get it there.
 

When that same application needs to be built on Windows, it takes some effort. Finding the dependencies (like OpenSSL or zlib), and getting them to compile (which is inconsistent from library-to-library on Windows) and then building the application itself—again, inconsistent—generates a binary that you can run. Nearly all of the time, if someone posts those binaries, they bundle up their copies of the shared libraries along with the application.  The trouble is, that there is no common versioning, or really, sharing of shared libraries on Windows. If your app and my app both use the same library, they could (and often do) ship with a different version of it.

And, there is the user side of the equation…

Of course. Consumers of open source software on Windows have been relegated to manually scouring the Internet for binaries, and they are often out-of-date, compiled against older compilers and libraries, and pretty hard to get working. Clearly there is a strong need for a package management system, along the same lines as apt, rpm, synaptic (and others) but built for the Windows platform, and compatible with Windows features.


Why not adapt the Unix-way on Windows?

There are two fundamental reasons. Primarily, it’s just not done that way on Windows. Since Windows doesn’t “look” like Unix, it’s not very easy to use the same scripts on Windows as on Unix. Sure, there are Unix-like environments for Windows (Cygwin, MinGW and Microsoft’s own SUA), but they really isolate the developer from Windows itself. While they do try to create a very Unix-like environment, you end up building Unix-style apps on Windows, and pretty much forgo the platform benefits that are available.

Secondly, open source software that was originally written for Windows won’t be using Linux-style tools anyway. Since I want to unify these two groups, I’m going to want a one-size-fits-all solution.

Really, the solution is to build it right—for Windows.

So, what exactly does “Building it Right” mean anyway?

That is, in a nutshell, the sixty-four kilobyte question.

For starters, this means using the tools, methodologies and technologies on Windows, as they were meant to be used, in order to take advantage of everything that Windows has to offer. I’m not interested in simply making a knock-off of the Unix-style way of doing things. Windows doesn’t store binaries in c:\usr\bin (/usr/bin) and libraries in c:\usr\lib (/usr/lib), so we’re not going to do things like that.

CoApp will:

  • Provide a distributed, community driven package management system for open source applications on the Windows Platform
  • Handle multiple versions of binaries using WinSxS (I know, even the mention of side-by-side components evokes fear, anger and the desire to go off-diet, but bear with me, I think we’ve got a solution), including multiple copies of the same version of the same library, compiled with different compilers.
  • Support 64 bit and 32 bit systems, without hassle or collisions.
  • Place binaries, libraries and header files in a logical and consistent location.
  • Have tools and methods for handling dependencies.
  • Create reliable installer packages (MSIs) for installing open source software.
  • Facilitate sharing of components and allow multiple projects to easily both participate and consume them.
  • Allow for upgrades and patching of both libraries and applications.
  • Be Windows developer friendly. No forcing of building using ‘make’, but rather taking advantage of the nifty IDEs we already have.
  • Also be Windows admin friendly. Even if it’s open source, you shouldn’t have to be a developer to put Open Source applications on Windows.
  • Use advanced optimization techniques like Profile Guided Optimization to produce optimized binaries.
  • Support future technologies as they come along.
  • Aid in the adoption of Windows Error Reporting (WinQual) to assist in making software run better on Windows.
  • End the eternal struggle between Green and Purple. Unless of course you’re a Drazi and are conducting elections.

Tall order? You bet. Still, I believe that it’s all achievable. I’ve spent the last several months working on some proof-of-concepts, fleshing out some ideas, and talking with some open source community members. Nothing is currently set in stone, and even the specifications are very fluid at this point.

I’ve started a project on Launchpad at http://Launchpad.net/coapp and the wiki at http://CoApp.org. I’m just starting the specifications and tools to make this happen, and I welcome everyone’s input and contributions.




My Open Source Moment…25 Years ago

February 23, 2010 09:09 by author Garrett Serack

I was thinking the other day, how long has it been since I’d first been exposed to Open Source software. Of course, the term “Open Source” hasn’t been around that long, but really, the spirit of open source software has existed for a very long time.

From one perspective, all the source code found in all those computer magazines (Byte, Compute!, Transactor…and so many more) that I read as a kid could be considered a form of Open Source: they published the source code, and let you play with it.  I didn’t think about it at the time, and I’m pretty sure they didn’t explicitly permit unrestricted redistribution, but I can’t say the magazines really cared one way or the other about it.

But there were those who did explicitly give their permission to redistribute it. Most often they used terms like “Public Domain” to deliberately declare that it was OK to pass the code around.

In late 1985, I became aware of VDO – the “Video Display Oriented Editor” for CP/M which was distributed as source code (ASM!) alongside a binary of the program.  It was the first text editor I ever used that supported the WordStar key bindings (ctrl-k, <key>), and between VDO and the later spiritual descendant “VDE”, I had those key bindings hardwired into my fingers. Even today, I have an editor that uses WordStar bindings installed on Windows, and under Linux I typically install “Joe” right away.

For me, VDO was really special, because it was the first program that I didn’t actually type in, I got it as a binary along with the source code. I found I could edit the source code and re-assemble it, complete with my changes. Sure, the changes to the code were really quite minor, but I always felt that was where the power was—Making the software into what *I* needed. For me, that’s always been the most important part of open source.

Since that moment, I’ve downloaded, compiled and modified a heckova lot of software. Sometimes I give others the changes, sometimes it’s just for me. 

So, I gotta ask, when was your Open Source moment? What was the first piece of Open Source software you used? Did you play with the source?

Tag your reply on twitter: #myOSSmoment




In Flanders Fields

November 11, 2009 08:50 by author Garrett Serack

It's Remembrance Day.

 

I'm pausing for those who have given of themselves to protect the freedoms I enjoy.

 

In Flanders fields the poppies blow
Between the crosses, row on row
That mark our place; and in the sky
The larks, still bravely singing, fly
Scarce heard amid the guns below.

We are the Dead. Short days ago
We lived, felt dawn, saw sunset glow,
Loved and were loved, and now we lie
In Flanders fields.

Take up our quarrel with the foe:
To you from failing hands we throw
The torch; be yours to hold it high.
If ye break faith with us who die
We shall not sleep, though poppies grow
In Flanders fields.

 

Lest we forget.




Crafting an Optimized PHP Build Process on Windows (Part IV)

June 23, 2009 19:16 by author Garrett Serack

Previously, I had discussed what it took to use PGO on the Windows PHP build. That led to me building automated build scripts…

Automation as the root of all evil

"Anything that can be done for you, automatically, can be done to you, automatically." – David C. Wyland

First, I had to get the entire dependency stack into the mix.  While some of the dependent libraries had VCProject files, some didn't.  Worse, even if they had them, you couldn't tell with a degree of certainty that they were compiled with the same settings which would enable them to take advantage of PGO optimization.  I began taking each project, updating (or creating, using the Trace and mkProject tools) the Visual C++ project files that would use the same settings as the rest, and eventually came up with a solution file that had 74 projects in it (some of the projects generated more than one binary).

Next, I had to actually automate the process of creating the vcproject files. Once you've got the right dependencies, the PHP build process cranks out over 30 binaries when you include the PHP extensions that get built as part of the core.  After what seemed like a million compile-verify-tweak iterations, I had the tools that could generate VCProject files for the core PHP and all the extensions, provided it was all in the right place.

Next I wrote a .cmd batch script that went step-by-step, checking out the source, compiling the dependent libraries, building the PHP makefile, compiling PHP like the community did—and logging what it was doing, then switching to instrumentation, rebuilding the dependencies again, building the stack, PGO training it with test data and some applications (Wordpress, MediaWiki and phpBB) and then relinking it with optimization.

I got the .cmd script almost working, but it was fairly fragile.  At that point I decided to switch batch scripting strategies, and in about a week, rewrote the batch script in JScript, which was far more flexible, and a lot more reliable.
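The general shape of a less fragile step-by-step driver can be sketched as follows. This is Python rather than the JScript I actually used, and it is not the real script: the function name `run_steps` and the placeholder steps are invented for illustration. The point is simply that every step is named, logged, and the run stops at the first failure.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("build")


def run_steps(steps):
    """Run (name, callable) pairs in order; stop at the first failure.

    Returns the list of step names that completed successfully.
    """
    completed = []
    for name, action in steps:
        log.info("starting: %s", name)
        try:
            action()
        except Exception:
            # Record the traceback and stop; later steps depend on this one.
            log.exception("failed: %s", name)
            break
        log.info("finished: %s", name)
        completed.append(name)
    return completed


def placeholder():
    """Stand-in for real work such as invoking a compiler."""


# Step names follow the order described above; the bodies are placeholders.
STEPS = [
    ("check out source", placeholder),
    ("compile dependent libraries", placeholder),
    ("build PHP makefile", placeholder),
    ("compile PHP (community settings)", placeholder),
    ("rebuild dependencies with instrumentation", placeholder),
    ("build instrumented stack", placeholder),
    ("PGO training", placeholder),
    ("relink with optimization", placeholder),
]

print(run_steps(STEPS))
```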

What's next…

"The future always arrives too fast... and in the wrong order." –Alvin Toffler

During this process, I've tweaked the build process that is generated quite a bit, added in a few more applications to the PGO training which cranks the performance up more and more. Now, I can add in more scripts to assist with the training pretty trivially, but it still takes some effort to package up an entire application like MediaWiki or Wordpress and include it into the build process.  Even once I've added in an application, I end up doing a whole slew of comparative testing to see what impact it has on the final executables.

As time goes forward, I'm sure there's more tweaking to be done, but in all likelihood, any significant performance gains are going to be the result of some modification of the PHP codebase itself.




Crafting an Optimized PHP Build Process on Windows (Part III)

June 18, 2009 14:18 by author Garrett Serack

Previously, I had talked about using PGO in the PHP build process. In order to use it I had to observe…

The Heisenberg build process

"A process cannot be understood by stopping it. Understanding must move with the flow of the process, must join it and flow with it." – The First law of Mentat, quoted by Paul Atreides to Reverend Mother Gaius Helen Mohiam

Really, what I needed was a tool in two parts. The first would watch what happens during the build process, and the second would take that data and spit out some .vcproj files.

When I want to see what's happening on my own system I use ProcMon, a Sysinternals tool that monitors processes, what files they touch, what commands get executed, etc. I grabbed that and tried to watch what happens when you run NMake on the makefile when building PHP. It turns out that there are a few problems with that: ProcMon isn't very scriptable (making it tricky to automate), and even if it were, it truncates long command lines in its log files once they pass a certain length.

I found nothing else that did quite what I needed, so I started thinking about how to write a tool that does the same thing.  In the past I have used Detours (an API detouring library built by Microsoft Research) to build a couple quick-and-dirty snoop/debugging tools.  Starting with a sample that came from the Detours library, I cobbled together a tool that would watch a process and its children, recording every file written or read, every command issued, and dump it into an XML file which I could process later.

Creating the project files

At the same time, I began working on a tool that would generate .vcproj files from the data gathered during the make process. I first tried just putting together a tool which assembled the .vcproj XML file from what I knew about the layout of the project file, but as the build got trickier, it got harder to make sure the XML came out the way that Visual Studio expected.  I turned to the Visual Studio SDK to see if there were any COM objects I could use to manipulate project files. There were, but they aren't documented in great detail, and they were really designed to be used inside Visual Studio for automation. Having scoured the planet, I found some examples of using the VCProjectEngine to generate project files.

For a couple of weeks solid, I worked on the tool to generate project files, compiling, testing, tweaking, etc.  I finally reached a point where I could generate a complete project file that would compile php.exe and php5.dll. Having finally arrived at this point, I built PHP using PGO instrumentation, ran the bench.php script from the PHP source directory, and then re-linked the project. This first time, I saw about an 18% improvement in speed over the previous version!
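For anyone who hasn't seen a PGO cycle before, the instrument, train, re-link sequence looks roughly like the sketch below. It only assembles the command lines and doesn't run anything. The /GL, /LTCG:PGINSTRUMENT and /LTCG:PGOPTIMIZE switches are the Visual C++ options of that era as I recall them; the helper name `pgo_commands` and the single-file "main.c" input are placeholders, and the real PHP build involves far more files and settings.

```python
import os


def pgo_commands(sources, output, training_args):
    """Return the MSVC command lines for an instrument/train/optimize cycle."""
    # cl drops object files in the current directory by default.
    objects = [
        os.path.splitext(os.path.basename(s))[0] + ".obj" for s in sources
    ]
    return [
        # 1. Compile with whole-program optimization so the linker can do PGO.
        ["cl", "/c", "/O2", "/GL"] + list(sources),
        # 2. Link an instrumented binary that records profile data when run.
        ["link", "/LTCG:PGINSTRUMENT", "/OUT:" + output] + objects,
        # 3. Exercise the instrumented binary with representative work.
        [output] + list(training_args),
        # 4. Re-link, letting the linker use the recorded profile.
        ["link", "/LTCG:PGOPTIMIZE", "/OUT:" + output] + objects,
    ]


for command in pgo_commands(["main.c"], "php.exe", ["bench.php"]):
    print(" ".join(command))
```

Note that step 4 is only a re-link: the objects compiled in step 1 are reused, which is why the training run sits between two link steps rather than between two full builds.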

That moment

"It ain't over 'til it's over, and maybe not then, either. " – Slovotsky's Law #29

Well, as anyone who's done software development will tell you, there's the moment when you finally get your program to do what you want under very controlled conditions, and then—quite some time later—there's the moment that you can give the fruits of that labor to someone else so they can do the same thing.

Now that I had passed the point where I'd finally proven that it was worth the effort to build a PGO-optimized version of PHP, I had to get it scripted so that it could be done in an automated fashion, not just on my computer, or a computer in our Lab.

In the final part, I wrap up with the automation of the build and look to where we might go next in PHP.




Crafting an Optimized PHP Build Process on Windows (Part II)

June 12, 2009 12:12 by author Garrett Serack

I had talked about getting started in building the PHP stack last time, now I’m taking it…

One step further

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." – Donald Knuth

A chance conversation I had last summer at OSCON with Trent Nelson—who was building Python on Windows—had planted the seeds of how to get PHP on Windows optimized further.  Trent was using the PGO features of Visual Studio to generate Python binaries that run faster.  Rather than spend a lot of time optimizing all the little bits of PHP itself, I thought that this would be an ideal way to improve the overall speed of PHP, provided I could find the right scenarios to train PHP with.  Little did I know that finding the right scenarios wasn't the hardest part.

  What is PGO? (from Wikipedia)
Profile-guided optimization (PGO) is a compiler optimization technique in computer programming to improve program runtime performance. In contrast to traditional optimization techniques that solely use the source code, PGO uses the results of test runs of the instrumented program to optimize the final generated code. The compiler is used to access data from a sample run of the program across a representative input set. The data indicates which areas of the program are executed more frequently, and which areas are executed less frequently. All optimizations benefit from profile-guided feedback because they are less reliant on heuristics when making compilation decisions.

Adding PGO to the existing build process

"I have not failed, I've just found 10,000 ways that won't work." – Thomas Edison

I had downloaded the source to the dependent libraries off the PHP wiki, checked out the PHP source code, and began the process of adding in PGO support to the existing build process. This proved to be extremely difficult.  Even limiting the scope to just the core of PHP itself, without the dependent libraries, I ran into trouble trying to compile using PGO instrumentation and then re-linking after running some tests.  The makefile that gets generated by the configure.js script (a JScript version of the automake configure script for the Windows platform) was just not built with what I had in mind.

I spent the better part of two weeks trying different approaches to tweaking the makefile so that I could use PGO to improve the PHP executable, but I kept running into roadblocks.  Worse, the closer I got to a makefile that did what I wanted, the farther away from the current build process I was getting, and I wasn't sure that what I would end up with would even be close to what was being built today.

The long dark winter road

"Only the meek get pinched. The bold survive." – Ferris Bueller

I came to the conclusion that I'd have to build new Visual Studio project files from scratch.  What worried me is that this would end up to be a completely different build process and I'd never get the community to abandon what was already working, so I'd better be able to rebuild these new project files easily.  I started looking (inside Microsoft and out) for any tools which generated Visual C++ project files.  I found someone internally who had used some JScript to create project files from text files, but after some experimentation, I found this was nowhere near what I needed.  What I really needed was a way to convert the generated Makefile into a .vcproj file—and not just 'wrap' it.

Once I found there was no such tool*, I began trying to figure out how to create one. I had this idea a few times in the last decade or so: watch how a program was compiled, and create a project file that does the same thing. Having tossed around the idea in my head before, I knew it wasn't going to be trivial, but without it, I couldn't do what needed to be done.

* Let me tell you: you never want to think about writing a tool to parse out what a makefile does.  It's rather like making a tool that tells you how sausage is made, in excruciating detail. Ugh.
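As a rough illustration of the "watch the build, then write a project file" idea, the sketch below takes one observed compiler command line and sorts its arguments into the kinds of settings a project file would need. This is only a guess at a starting point, not the tool described in this series. The function name `parse_cl_command`, the define names, and the sample command line are hypothetical; the only things assumed about the real compiler are that `/D` introduces a preprocessor define and `/I` an include directory.

```python
import shlex

def parse_cl_command(command):
    """Sort one observed compiler command line into project-style settings."""
    settings = {"defines": [], "include_dirs": [], "sources": [], "other_flags": []}
    # posix=False keeps Windows backslashes intact; [1:] skips the compiler name.
    args = iter(shlex.split(command, posix=False)[1:])
    for arg in args:
        if arg[:2] in ("/D", "-D"):
            # Accept both "/DNAME" and "/D NAME".
            settings["defines"].append(arg[2:] or next(args, ""))
        elif arg[:2] in ("/I", "-I"):
            # Accept both "/Idir" and "/I dir".
            settings["include_dirs"].append(arg[2:] or next(args, ""))
        elif arg[0] in "/-":
            settings["other_flags"].append(arg)
        else:
            settings["sources"].append(arg)
    return settings

# Hypothetical command line, as it might be captured from a build log.
observed = r"cl /nologo /c /DDEMO_BUILD /D DEMO_EXPORTS /I. /Imain /O2 main\main.c"
parsed = parse_cl_command(observed)
for key, values in parsed.items():
    print(key, values)
```

A real tool would also have to cope with response files, linker invocations, per-file flag differences and quoting, which is where most of the difficulty lies.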

In Part III, I’ll talk about the trouble with observing the build process.




Crafting an Optimized PHP Build Process on Windows (Part I)

June 9, 2009 15:18 by author Garrett Serack

For the last several months, I’ve been working very deeply with PHP, specifically compiling the PHP core itself and looking for avenues for optimization. This is the first of four posts about the journey I’ve been on with PHP.


I get started building PHP

"It is a bad plan that admits of no modification" – Publilius Syrus

I started working on building PHP itself about a year ago. Initially, I was trying to put together an environment to compile the PHP stack so that I could do some debugging and track down a few faults. We were encountering these in some of the PHP applications that we were modifying to use the SQL Server PHP driver that the SQL Server team here at Microsoft was creating.

Once I began to work with the source code, I found out very quickly that on top of having a hard time recreating the exact same binaries that the community build process generated, there were a large number of dependent libraries that were available in binary-only form which were kept in a zip file that was passed around from developer to developer. That seemed a little odd for an open-source project but I can certainly understand that over time, unless someone is working hard to keep it all together, these things happen.

Around the same time, the community had started to invest time and effort to 'clean up' the dependencies for building PHP on Windows, and move towards supporting VC9 (Visual Studio 2008) as an officially supported compiler.

In order to help in this process, I built out some testing environments in our lab, which let me compile PHP on Windows and Linux and get decent, reliable test results that we could use to identify any shortcomings we could address. This included benchmarking not just the core PHP executable, but replicable and comparable testing of PHP applications such as WordPress, MediaWiki, Gallery and phpBB.

PHP 5.3 on Windows: Not your father's PHP

"I'm looking for a lot of men who have an infinite capacity to not know what can't be done." – Henry Ford

For PHP 5.3, Pierre (and others) had gone out and found up-to-date versions of all the dependencies, brought them together, and managed to get them compiling with VC6 and VC9.  They had posted these in binary and source form to the PHP Windows Internals site, which allows anyone to rebuild the PHP stack on Windows, and theoretically, get the same results as the 'official' build.

Jumping in at that point was much easier than it had been, as all you had to do was download the binaries of the libraries, check out the source code, and run a few commands at the command line, and presto, you had your PHP executables.

At this point Pierre and I played around with the build flags on VC9 and found some settings that gave some pretty significant improvements to the speed of PHP vs. the speed of the VC6 version, and a lot of speed improvement vs. the old 5.2.x line of PHP.

In Part II, I’ll talk about going one step further with optimization.




