My purity you stole

Yesterday, I wrote about how $f() variable-function syntax works in PHP. While it is pretty bad, it’s also the groundwork for understanding the ways in which create_function is terrible.

No, I mean besides taking a string full of code as one of its arguments.

Consider this contrived example:

<?php
$input = array(1, 2, 3);
 
$monolithic_dimensions = array_map(
    create_function('$elem', 'return $elem * $elem;'),
    $input
);
 
/* Prints Array([0] => 1, [1] => 4, [2] => 9) */
print_r($monolithic_dimensions);
?>

On the face of it, this looks like a decent approximation of a functional programming idiom. However, every time PHP executes create_function, it adds a new function to the global function table, and nothing removes it again while the script is running.

What’s really happening is a little easier to see if you look at create_function’s return value:

<?php
 
$square = create_function('$elem', 'return $elem * $elem;');
 
print gettype($square) . "\n"; /* Prints string */
 
/* Prints
0  []
108	[l]
97	[a]
109	[m]
98	[b]
100	[d]
97	[a]
95	[_]
49	[1]
*/
for ($i = 0; $i < strlen($square); $i++) {
    print ord($square[$i]) . "\t[" . $square[$i] . "]\n";
}
 
?>

Yup. PHP is creating a function at runtime, with an “impossible” name (the zero byte isn’t allowed in function statements) to avoid collisions with user-defined functions.

5 Responses to “My purity you stole”

  1. Grumqa says:

    Doesn’t C++ sort of do the same thing with virtual functions? What would you have it do instead?

    I’d be curious to know if it creates a new function every time it is called with the same arguments.

    • Owen Jacobson says:

      No; C++’s virtual function tables are entirely generated at compile time, and the names are real symbols (well, fixed offsets) rather than string literals. This is a little easier to see in Objective-C, where the symbol table is in easy reach, but you can model C++’s behaviour with this C:

      #include <stdio.h>
       
      /*
      class a {
      public:
          const char *get_message();
          a(const char *);
      private:
          const char *message;
      };
      */
       
      struct a {
          const char *message;
      };
       
      struct a_virt {
          const char *(*get_message)(
              struct a *this,
              const struct a_virt *this_vtable
          );
          void (*__constructor__)(
              struct a *this,
              const struct a_virt *this_vtable,
              const char *message
          );
      };
       
      /* const char *a::get_message() */
      const char *a__get_message(
          struct a *this,
          const struct a_virt *this_vtable
      ) {
          return this->message;
      }
       
      /* a::a(const char*) */
      void a____constructor__(
          struct a *this,
          const struct a_virt *this_vtable,
          const char *message
      ) {
          /* No base class to delegate to. */
          this->message = message;
      }
       
      /* The actual virtual table for class a */
      const struct a_virt a_virt = {a__get_message, a____constructor__};
       
      /*
      class b : public a {
      public:
          b(const char *);
          void print();
      };
      */
       
      struct b {
          /* Fields inherited from type 'a' -- taking advantage
             of an oddity of the C structure padding rules, this
             is even legal as-is. Just ill-advised. */
          const char *message;
       
          /* No new fields in 'b'. */
      };
       
      struct b_virt {
          /* Methods inherited from a */
          const char *(*get_message)(
              struct a *this,
              const struct a_virt *this_vtable
          );
       
          /* Local methods. */
          void (*__constructor__)(
              struct b *this,
              const struct b_virt *this_vtable,
              const char *message
          );
          void (*print)(
              struct b *this,
              const struct b_virt *this_vtable
          );
      };
       
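      void b____constructor__(
          struct b *this,
          const struct b_virt *this_vtable,
          const char *message
      ) {
          /* Chain to a::a. The casts lean on 'b' starting with the same
             fields as 'a', and b_virt starting like a_virt. */
          a____constructor__(
              (struct a *) this,
              (const struct a_virt *) this_vtable,
              message
          );
      }
       
      void b__print(struct b *this, const struct b_virt *this_vtable) {
          /* Call this->get_message() through the inherited vtable slot. */
          const char *message = this_vtable->get_message(
              (struct a *) this,
              (const struct a_virt *) this_vtable
          );
          printf("%s\n", message);
      }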
       
      /* The actual vtable for type b */
      const struct b_virt b_virt = {
          a__get_message,
          b____constructor__,
          b__print
      };
       
      int main() {
          /* Create a b */
          struct b instance_of_b;
          b_virt.__constructor__(&instance_of_b, &b_virt, "Hello, world.");
       
          /* Call b.print() */
          b_virt.print(&instance_of_b, &b_virt);
      }

      $ ./virtual-demo
      Hello, world.
      

      All of this happens at compile time.

      As for the second part of your question:

      <?php
       
      $x = create_function('', 'return 0;');
      $y = create_function('', 'return 0;');
       
      print $x == $y; /* Prints nothing (FALSE). */
       
      ?>

      So yes, it does re-create the function every time it’s called, even if you give it the exact same code. This doesn’t really surprise me; caching function bodies is the sort of advanced feature that needs to be implemented differently in every program.

    • Owen Jacobson says:

      For some ideas on how to abuse this in Objective-C, consider the links under isa swizzling. (Objective-C calls the pointer to an object’s virtual method table isa.)

  2. Grumqa says:

    So does it just purge the function table now and then? It must be doing something so as not to blow up the server because somebody hit a page that creates a function too many times.

    • Owen Jacobson says:

      Well, see, it’s not a big deal, because the interpreter starts, renders one page, and exits (destroying everything).

      Hope you didn’t want to write a long-running program that isn’t a web page in it.
