Allow me to share some of my understanding...
In the example you are offering, yes, to some degree we can say one item is twice as large as the other. However, that compares only a few items, and with so few there is neither contradiction nor inconsistency.
It also depends on how the team sizes items.
Another team may say --
Small = 1
Medium = 5
Large = 8
Now, when it comes to planning, suppose the team is forecasting a numerical value of 4. Let's consider your example.
User Story 1 = 1
User Story 2 = 5
User Story 3 = 5
User Story 4 = 1
User Story 5 = 1
User Story 6 = 8
What do you think the Dev Team will pull during Planning?
(We can all figure this out, but it's just to think about something else.)
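For anyone who wants to check the arithmetic, here is a rough Python sketch of the question above. It assumes the forecast of 4 acts as an upper limit on the total size pulled, and that the team simply wants the combination that comes closest to that limit. Both are my assumptions for illustration, and `best_fit` is a made-up helper, not a Scrum rule; a real Dev Team would also weigh ordering and value.

```python
from itertools import combinations

# Sizes copied from the example above (Small = 1, Medium = 5, Large = 8).
stories = {
    "User Story 1": 1,
    "User Story 2": 5,
    "User Story 3": 5,
    "User Story 4": 1,
    "User Story 5": 1,
    "User Story 6": 8,
}
forecast = 4  # assumption: treated as an upper limit on total size


def best_fit(stories, forecast):
    """Return the first combination whose total is closest to the forecast without exceeding it."""
    best, best_total = (), 0
    for r in range(1, len(stories) + 1):
        for combo in combinations(stories, r):
            total = sum(stories[name] for name in combo)
            if best_total < total <= forecast:
                best, best_total = combo, total
    return list(best), best_total


selection, total = best_fit(stories, forecast)
print(selection, total)
```

With these sizes, only the three Small stories fit, for a total of 3. Nothing sized 5 or 8 can be pulled at all, and no combination lands exactly on 4.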
On another note, I would like to bring awareness to other ways of sizing such as Fibonacci -- http://en.wikipedia.org/wiki/Fibonacci_number
If you notice, as things get larger, there isn't a clear way of telling what is "twice" or "thrice" as large as the other, etc. This is not to suggest your Dev Team should be using this.
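To make that point concrete, here is a small Python sketch that prints the ratio between neighbouring sizes on a Fibonacci-style scale. Starting the scale at 1, 2 and using seven values are my assumptions for illustration; teams often use a modified scale, and `fibonacci_sizes` is just a hypothetical helper name.

```python
def fibonacci_sizes(count):
    """Build a Fibonacci-style sizing scale starting at 1, 2 (an assumed starting point)."""
    sizes = [1, 2]
    while len(sizes) < count:
        sizes.append(sizes[-1] + sizes[-2])
    return sizes[:count]


sizes = fibonacci_sizes(7)

# Ratio of each size to the one just below it on the scale.
ratios = [round(larger / smaller, 2) for smaller, larger in zip(sizes, sizes[1:])]

for smaller, larger, ratio in zip(sizes, sizes[1:], ratios):
    print(f"{larger} vs {smaller}: {ratio}x")
```

Only the first step (2 vs 1) is exactly "twice". After that, each size is roughly 1.5 to 1.7 times the previous one, so "twice as large" stops mapping neatly onto the scale.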
Is the Dev Team using just 3 sizes right now? How many Sprints have elapsed?